Vol. 27, No. 2 April, 1943 


Journal of Applied Psychology 


Edited by: Donald G. Paterson, University of Minnesota


Consulting Editors 


Paul S. Achilles, Psychological Corporation; Walter V. Bingham, A.G.O., War Department;
Harold E. Burtt, Ohio State University; Arthur I. Gates, T. C. Columbia University;
John G. Jenkins, University of Maryland; Irving Lorge, T. C. Columbia University;
Quinn McNemar, Stanford University; Willard C. Olson, University of Michigan;
James P. Porter, Ohio University; Edward K. Strong, Jr., Stanford University;
Morris S. Viteles, University of Pennsylvania; Joseph Zubin, N. Y. Psychiatric Hospital.





Table of Contents 


The Effects of Benzedrine Sulphate and Caffeine Citrate on the Efficiency of
College Students: C. D. Flory and J. Gilbert

A Factor Analysis of Some Clinical Performance Tests: J. C. Heston

Distribution of Scores From Revisions of Army Alpha: G. K. Bennett

The Adaptability Test: A Fifteen Minute Mental Alertness Test for Use in
Personnel Allocation: J. Tiffin and C. H. Lawshe, Jr.

Extension of the Minnesota Rate of Manipulation Test: C. E. Jurgensen

The Effects of a Second Administration of an Employment Test:
L. W. Ferguson

Some Comments on “The Prediction of Differential Achievement in a
Technological College”: A. E. Traxler

Likes, Dislikes and Vocational Interests: R. F. Berdie

Developing an Industrial Merit Rating Scale: J. E. Zerga

A Note on the Experimental Study of the Appraisal Interview:
E. S. Schneidman


News and Notes 
Book Reviews 
New Books, Monographs, and Pamphlets 





Published Bi-monthly by The American Psychological Association, Inc. 
With the Cooperation of The American Association of Applied Psychology 
Prince and Lemon Sts., Lancaster, Pa., and Northwestern University, Evanston, Illinois 


Entered as second-class matter, May 7, 1935, at the post office at Lancaster, Pa., under the Act of March 3, 1879. 
Copyright, 1943, by The American Psychological Association, Inc. 








PUBLICATIONS OF 


THE AMERICAN PSYCHOLOGICAL ASSOCIATION 
Willard L. Valentine, Business Manager


PSYCHOLOGICAL REVIEW
Herbert S. Langfeld, Editor
Princeton University 


Contains original contributions only, appears bi-monthly, January, March, May, July, 
September, and November, the six numbers comprising a volume of about 540 pages.
Subscription: $5.50 (Foreign, $5.75). Single copies, $1.00. 


PSYCHOLOGICAL BULLETIN
John E. Anderson, Editor
University of Minnesota 


Contains critical reviews of books and articles, psychological news and notes, university 
notices, and announcements. Appears monthly (10 issues), the annual volume comprising 
about 665 pages. Special issues of the Bulletin consist of general reviews of recent
work in some department of psychology. 


Subscription: $7.00 (Foreign, $7.25). Single copies, 75c. 


JOURNAL OF EXPERIMENTAL PSYCHOLOGY
S. W. Fernberger, Editor, on Leave; Francis W. Irwin, Acting Editor
University of Pennsylvania


Contains original contributions of an experimental character. Appears monthly (since
January, 1937), two volumes per year, each volume of six numbers containing about 520
pages.

Subscription: $14.00 ($7.00 per volume; Foreign, $7.25). Single copies, $1.25.


PSYCHOLOGICAL ABSTRACTS
Walter S. Hunter, Editor
Brown University 


Appears monthly, the twelve numbers and an index supplement making a volume of about 
700 pages. The journal is devoted to the publication of non-critical abstracts of the 
world’s literature in psychology and closely related subjects.

Subscription: $7.00 (Foreign, $7.25). Single copies, 75c. 


PSYCHOLOGICAL MONOGRAPHS
John F. Dashiell, Editor
University of North Carolina


Consist of longer researches or treatises or collections of laboratory studies which it is
important to publish promptly and as units. The price of single numbers varies according
to their size. Numbers are gathered into volumes of about 500 pages.

Subscription: $6.00 per volume (Foreign, $6.30).


JOURNAL OF ABNORMAL AND SOCIAL PSYCHOLOGY


Gordon W. Allport, Editor
Harvard University


Appears quarterly, January, April, July, October, the four numbers comprising a volume.
The journal contains original contributions in the field of abnormal and
social psychology, reviews, and notes.

Subscription: $5.00 (Foreign, $5.25). Single copies, $1.50.


JOURNAL OF APPLIED PSYCHOLOGY
Donald G. Paterson, Editor
University of Minnesota


Covers the applications of psychology in business, industry, education, government, etc.
Appears bi-monthly, February, April, June, August, October, and December.

Subscription: $6.00. Single copies, $1.25.

Subscriptions, orders, and business communications should be sent to 


THE AMERICAN PSYCHOLOGICAL ASSOCIATION, INC. 


NORTHWESTERN UNIVERSITY, EVANSTON, ILLINOIS




























The Effects of Benzedrine Sulphate and Caffeine Citrate 
on the Efficiency of College Students 


Charles D. Flory and Jane Gilbert 
Lawrence College 


College students who frequently get themselves into positions where 
they must “burn the candle at both ends” have great interest in stimu-
lants or “pep pills” that promise to help them over their academic hurdles.
Seldom does a student trouble to seek competent advice concerning the 
efficacy of stimulants, and often his advisor would be unable to give him 
accurate information about the drug in question. It is the purpose of this 
study to evaluate the effects of benzedrine sulphate and caffeine citrate
on speed of action, reading rate, reading comprehension, and thinking
ability.


Early Lawrence Experiments 


For four years the senior author has used benzedrine sulphate as a 
stimulant in one of the experiments in the general psychology laboratory 
at Lawrence College. Conditions in the early experiments were not 
sufficiently controlled to warrant separate publication, but some observa- 
tions have direct bearing on the problem under consideration. One stu- 
dent volunteered the information that he had been using the drug with 
success, contemplated using it again, and would be glad to submit his 
subjective reaction during an approaching examination period. His 
observations were as follows: 


I have been drinking a lot of coffee for the last six days in an effort 
to stay awake late at night. At first it worked quite satisfactorily but 
now it no longer has the desired effect. In the last six days I have had 
25½ hours sleep instead of the usual 40. Last night I stayed awake with
the aid of coffee and determination until four A.M. I slept two hours 
until six A.M. I studied from six until seven-thirty, ate a light breakfast, 
at which I had two cups of coffee. I loafed from eight-fifteen until 
eight-forty-five, when I took one benzedrine sulphate pill prior to writing 
my exam. I felt no noticeable effects until about nine-thirty. I began 
to get an anxious feeling in the upper part of my stomach. My mind 
worked very rapidly and seemed to be able to consider one idea after 
another with great speed. My hands were sweating. I felt impatient 
at the slowness of my writing, but I was writing at top speed. I wasn’t
confused. I felt “hot” intellectually, as though I were at my very keen-
est. My mind was racing, yet I felt that I had complete control of the
sequence of thought and was capable of ordinary thinking. It seemed 
that my memory was clearer and working better than ever. About 
ten-thirty I reached the climax of stimulation. I felt happy, powerful, 
quick,—every faculty sharpened. I took a deep breath and resisted an 
impulse to laugh and tell my neighbor how good I felt. Instead I said 
to myself, “I feel swell!” Gradually, but not noticeably, I began to lose 
my excited feeling. By twelve noon, I could feel the contrast. 


Intimate acquaintance with this student and knowledge of his per- 
formance indicate that his sense of well being was more apparent than 
real. His achievement did not agree with the reported efficiency, but 
his observation led to more carefully controlled experimentation. 

The first experiment in the Lawrence College laboratory was made 
upon 94 subjects divided into experimental and control groups. Tests 
were given 30 minutes after the experimental group had taken 15 mg. of 
benzedrine sulphate and the control group had taken a capsule containing 
colored sugar. Age, sex, and intelligence were controlled in pairing the 
groups. The results are presented in Table 1. 








Table 1

The Effect of Benzedrine Sulphate on Reading, Multiplying, and Analogies Scores

                                                          Benzedrine    Sugar
                                                            Group       Group
Tests                                                       Means       Means

Reading rate (words per minute)                             458.0       453.0
Seconds required to multiply six problems of four-place
  numbers                                                   318.0       368.0
Number of analogies attempted                                24.4        22.9
Number of analogies correct                                  19.2        18.3
Percentage of accuracy for analogies                         79.0        82.0





These results fail to show any significant effect of benzedrine sulphate 
on group means. The slight advantage of the benzedrine group in read- 
ing rate and analogies scores is nullified by the superiority of the control 
group in multiplication speed. 

The subjective reports of the students in these two groups are both 
interesting and suggestive. The following excerpts are quoted directly 
from the benzedrine group: 


1. “I began to feel dull, found it difficult to adjust my thoughts to 
one thing. I was expecting a period of clear thinking, but found a great 
deal of confusion. The effect was deadening rather than stimulating. 
I became convinced that I did not have a stimulant.” 

2. “In the course of 30 to 50 minutes the stimulant began to take
effect. The subject was slowly less able to concentrate. After the lab 
period the subject spent the next two hours in bed. He attended a 
musical festival in the evening and about nine o’clock the subject’s belly 
began to ache like hell. The next morning the ache was removed from 
the belly to my head. The ache was dull and throbbing. It felt as if 
the subject had been on a three-day drunk. A decided letdown was 
experienced and the subject felt like a wreck for the past two days.” 

3. “I felt exceptionally sleepy and quite dopey. After leaving the 
lab I felt nauseated. I became quite talkative and talked continuously.
I acted very silly and gay. During the evening I studied. I never
studied so well or accomplished more in my life. I translated 45 pages 
of French in 45 minutes. I could not go to sleep all night long. I went 
to bed about 8 A.M. and against my will I began to cry. After crying I 
must have felt better, because I went to sleep. I do not wish to repeat 
this experience again.” 

4. “I seemed to think faster than I could keep up with myself. I felt
foggy and became dizzy after smoking. I made many mistakes, for I 
was unusually far ahead of myself. In general the stimulant seemed to 
slow me down and make me nervous.” 

5. “I felt light-headed and numb as though I had taken an anesthetic. 
There was a queer prickling feeling in my skin and a feeling of restlessness. 
When I retired my body felt as if it were asleep, but my brain would not 
stop working. I finally got to sleep about 4:30 A.M. Two days later 
I experienced a tremendous letdown. The first evening after the lab I 
left the table because the conversation of my friends seemed so silly and 
irritating to me.” 

6. “I experienced an increase in speed but little increase in accuracy 
and efficiency. I was dopey and slap-happy. I felt dizzy and had a kind
of buzzing in my head.” 

7. “I felt normal during the lab and during the evening. I noticed 
no physical change. A friend called me about 9:30 and told me the 
effect the stimulant had on her. I had almost convinced myself that I 
had had a stimulant, but then dismissed the idea with the explanation 
that it was only pure imagination.” 

8. “At 4:30 my mind became a sort of a blank and I stumbled over
everything that I wanted to say. About 5:30 I grew nervous and jumpy.
ree seemed far away from me. The following day I was very
oggy.”


The subjective accounts show evidence of apparent speed, aches, 
dizziness, dullness, irritability, moodiness, restlessness, wakefulness, clear- 
headedness, and lack of concentration. If the changes noted are physio- 
logical rather than psychological, then benzedrine has markedly different 
results on individuals who are approximately equal in age and intelligence. 
Severe results were reported more frequently by women than by men. 

Some excerpts from the laboratory reports of the sugar group are 
given below. 


1. “I felt cross and moody all evening, although at times I was rather 
slap-happy.” 

2. “After dinner I became very sleepy and could hardly drag myself
about. My mind and body just didn’t feel as though they were co- 
ordinating. About nine o’clock I suddenly felt very gay and stimulated. 
This condition continued until I retired at twelve. My body was tired 
but my mind felt keen. I usually go to sleep immediately, but on this 
night I had a great deal of difficulty getting settled down after going to 
bed. I felt normal next day.” 

3. “I started yawning and feeling sleepy about twenty minutes after 
taking the pill. I felt as though I just whizzed through the problems 
but found it took me a considerable length of time. As I went on I kept 
getting slower and slower. The next day I retested myself on these 
problems and found that I got faster and faster as time went on. My 
reading rate was slower than average, and I could scarcely remember 
what I had read. After leaving the lab my knees felt weak and my nerves
tense. This feeling continued until late the next afternoon. Around 
dinner time my nerves were tense and I felt like screaming. After ten 
o’clock this terrible tenseness subsided, and my eyes were so heavy I could 
hardly hold them open. I was so sleepy I couldn’t study and couldn’t 
concentrate. I was tired and went to bed about nine-thirty. The next 
morning I awoke feeling half dead and was still yawning. I was also 
rather sick. I was more sleepy in class than usual and could not take 
intelligible notes. I began to feel nauseated in class, but seemed to feel 
normal about noon. I became quite sick again that evening, but woke 
up the following day feeling quite normal. This seems to prove to me 
that there was either something in that capsule or I’m crazy.” 


The results with colored sugar capsules as the stimulant show clearly 
that the psychological effect of suggestion plays no small role in the sub- 
jective reactions. Nervousness, fatigue, exhilaration, tenseness, wake- 
fulness, aches, and exhaustion seemed to result from the mere fact that 
the subjects were expecting something to happen. All were told that 
each pill contained a stimulant but that the effects might show wide 
individual variations. 


Other Experimental Findings 


Numerous experiments have been conducted with animals to deter- 
mine more precisely the effect of benzedrine sulphate on the central 
nervous system. Alles (1) reported that administration of this drug to 
dogs caused increased pulse amplitude and sometimes rate, dependent 
upon reflex slowing caused by increased blood pressure. These changes 
were the result of cardiac output and the vaso-constrictor effect of the 
drug. Brown and Searle (3, 19) found that injections of benzedrine 
sulphate produced a marked increase in the activity of white rats. 
Wentink (25) confirmed this finding and reported that subcutaneous in-
jections of benzedrine into white albino rats increased their motor
responses 142 per cent. 

Studies of psychotic patients have also yielded interesting data con-
cerning the psychological effect of benzedrine as well as its value in
stimulating the central nervous system. It is also reported to be effective 
in the treatment of narcolepsy (23, 24), although these results are usually 
temporary. Wilbur, Maclean, and Allen (26) found that benzedrine is 
more effective in alleviating less severe cases of depression. Their study 
of 100 mental patients at the Mayo Clinic shows that those receiving
10-20 mg. of benzedrine before breakfast and frequently at noon showed 
definite changes, while the symptoms of those receiving placebo tablets 
did not change noticeably. However, these effects were all temporary 
and therefore probably did not fundamentally or permanently alter 
psychotic or neurotic disorders. 

Wooley (27) reported that administration of 10-20 mg. of benzedrine 
produced only temporary improvement in depressed patients. His con- 
trol group showed a wide diversity of responses, including pleasurable 
stimulation, drowsiness, and difficulty in concentrating. Shube, Mc- 
Manamy, Trapp, and Myerson (18) administered 10-30 mg. of benzedrine 
daily to 80 psychotic patients for 30 days and noted no improvement in 
any of the cases. In fact, they noted temporary accentuation of the 
psychosis in 15 cases. 

In another study, Davidoff and Reifenstein (7, 8) reported that 10-30 
mg. of benzedrine is more stimulating to normal persons than to depressed 
patients in whom it produced increased motor and speech activity and 
general efficiency rather than elevation of mood. Myerson (15) has
shown that normal persons suffering from fatigue due to insufficient sleep 
received immediate pleasant relief after taking 5-20 mg. of benzedrine 
sulphate, although sleep was impaired when the drug was administered 
late in the day. 

Molitch and Eccles (14) found that intelligence test scores of boys 
taking benzedrine sulphate improved more than the scores of the control 
group. The average percentage changes from the median score of the 
control tests were 6.71, 12.00, and 15.33 for the respective groups taking 
10, 20, and 30 mg. of benzedrine. Corresponding changes for the control 
groups were 2.71, 8.00 and 12.83. 

A similar study by Sargent and Blackburn (17) on 25 mental patients 
showed that the average intelligence test scores of the benzedrine group 
improved 3.9 I.Q. points or 8.7 per cent, while the control group remained 
the same. They attributed this increase to subjective increases in con- 
fidence and mental alertness rather than an increase in intelligence itself. 
The effects were greater in patients having emotional disturbances than 
in cases of more profound total personality dissociation. Barmack (2) 
also reported that development of more favorable attitudes, rather than
stimulation of factors involved in adding, caused an increase in rate but
not in accuracy in addition problems. The divergence in rate of work 
was more apparent at the end of a two-hour period. 

Nathanson (16) concluded from his studies that 10-20 mg. of benze- 
drine produces a definite stimulation of the central nervous system result- 
ing in euphoria, lessened fatigue, and increased energy and capacity for 
work. Therefore, benzedrine sulphate should help an individual prepare 
for activities requiring an unusual expenditure of mental energy. 

Similar results were reported by Gwynn and Yater (9). While none 
of their subjects suffered any serious ill effects, use of benzedrine was not 
recommended except as an emergency measure. 

In an early study, Hollingworth (10) reported that administration of 
1-3 grains of caffeine improved typing speed and decreased errors simul- 
taneously, while a dose of 4 to 6 grains produced retardation. Stimulat- 
ing effects usually occurred an hour after the drug was taken. Shilling 
(20) reported that the mean reaction time of his subjects was 6 to 8 per 
cent slower 50 minutes after a dose of caffeine than at the beginning of 
the experiment. Cheney (5, 6) found that 5 mg. of caffeine per kilogram 
of body weight reduced reaction time one and one-half to three hours 
after dosage, but no effect was noticeable after 24 hours. These results 
were determined by reading rates and a key-tapping test. However, 
Horst (11, 12) asserted that the effects of 2 mg. of caffeine per kilogram 
of body weight were still noticeable after 24 hours in some subjects. 

Cattell (4) found that 4 grains of caffeine were detrimental to intelli- 
gent and associative performance. On the contrary, results on a memory 
test showed some improvement although this improvement was not sta- 
tistically significant. There was some indication that men are more 
susceptible to caffeine than women and that younger people are more 
susceptible than older people, but these inferences have not been con- 
firmed. Hull’s findings corroborated these results (13). He found that 
5 grains of caffeine retarded learning rate from 2 to 3 per cent, but these 
differences were not statistically reliable. 

Switzer (22) administered 5 grains of caffeine citrate to 20 subjects 
and found that four hours later conditioned reactions were augmented 
and extinction was more difficult. In addition, the latent period was 
shortened under the influence of caffeine. Switzer (21) summarizes the 
effects of caffeine on human behavior, reporting that moderately large 
doses stimulate reaction time, mental activity, and associative perform-
ance although it may depress individual functions. It also improves
perception. The effects of caffeine dosage may last from six to seven 
hours. However, its effects on human learning are negligible except 
possibly as it delays mental fatigue. 

The foregoing summary shows that the evidence concerning the effects
of benzedrine sulphate and caffeine citrate is still lacking in conclusiveness.
The present study seeks to contribute to the cumulating evidence in
this field.


Procedure 


The subjects in this experiment consisted of 129 college students en- 
rolled in general psychology. The experiment was conducted at one of 
the regularly scheduled laboratory periods. The subjects were divided 
into three groups equated as to intelligence and sex. Percentiles on the 
A.C.E. Psychological Examination were used in matching the groups. 
The mean percentile rank in each group was 63.3, 63.2, and 62.4 respec- 
tively. Each subject in Group I took a capsule containing 15 mg. of 
benzedrine sulphate. The subjects in Group II were given 5 grains of 
caffeine citrate, while those in Group III received a pill containing sugar 
of milk. All the capsules were identical in appearance and size. The 
contents of the pills were unknown to all subjects. An attempt was made 
to have each subject believe that all pills contained some type of drug. 
Each subject was instructed to note any physical symptoms that de- 
veloped subsequently as well as changes in his feelings or in his ability 
to concentrate. Any student who felt that a stimulant might be detri- 
mental to his health was excused from the experiment. The experiment 
was planned with the approval of the college physician. 
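
Since the benzedrine dose is stated in milligrams and the caffeine citrate dose in grains, a unit conversion may help in comparing the two. The following is a minimal sketch, assuming the standard grain of 64.79891 mg; the function names are illustrative and form no part of the study. The converted figure is the weight of the citrate salt, not of caffeine alone.

```python
# Illustrative unit conversion for the doses named in the Procedure.
# Assumption: 1 grain = 64.79891 mg (the standard definition of the grain).
MG_PER_GRAIN = 64.79891


def grains_to_mg(grains: float) -> float:
    """Convert a weight in grains to milligrams."""
    return grains * MG_PER_GRAIN


def mg_to_grains(mg: float) -> float:
    """Convert a weight in milligrams to grains."""
    return mg / MG_PER_GRAIN


# Group II: 5 grains of caffeine citrate, expressed in milligrams.
print(f"5 grains of caffeine citrate = {grains_to_mg(5):.0f} mg")

# Group I: 15 mg of benzedrine sulphate, expressed in grains.
print(f"15 mg of benzedrine sulphate = {mg_to_grains(15):.3f} grain")
```

On this conversion the caffeine citrate capsule weighed roughly twenty times as much as the benzedrine dose, which is one reason the capsules had to be made up to look alike.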

Speed of action was measured by asking each subject to tap on a 
Veeder counter as many times as possible in a 10-second period. Two 
trials counted as one test. The first test was made immediately after the 
pills had been taken, before the drugs could have had any effect. Follow- 
ing the tapping test, reading speed and comprehension were measured by 
the Booker Reading Test, Form 1.¹

Forty-five minutes after the opening of the laboratory period a second
tapping test was made. This test was used chiefly to keep the subject
busy since comparisons were made only between the first and third
tapping tests.

Thinking ability was measured by a multiple-choice vocabulary test
of 70 items selected from the Booker Vocabulary Tests, Forms A and D,²
and a 100-item analogies test selected from Freeman and Flory’s Analogies
Test.³ These tests measured speed of thinking rather than power since
the time limits were short enough that neither test could be finished by
any subject.

¹ Ivan A. Booker. A Test of Achievement in Silent Reading. Mimeographed.
Used by permission of the author.

² Ivan A. Booker. Test of Reading Vocabulary. Mimeographed. Used by per-
mission of the author.

³ Frank N. Freeman and Charles D. Flory. Growth in Intellectual Ability as
Measured by Repeated Tests. (Monograph No. 2, Vol. II. Soc. for Res. in Child
Dev., Washington, D. C., 1937), pp. 113.

A third tapping test was then given, followed by Form 4 of the Booker 
Reading Test. The entire experiment was completed in the laboratory 
period of two hours. The work was planned to keep all subjects working 
at top speed most of the time throughout the two-hour period. 














Results 


Student opinion under non-experimental conditions suggests the hy-
pothesis that quickness and dexterity increase following the administra-
tion of benzedrine sulphate. The effect of benzedrine and caffeine as
measured by the tapping tests is presented in Table 2. Groups I and III,
which differed insignificantly at the start, 3.5 ± 2.41, were closer together
at the end of the experiment than in the beginning. All groups gained in
speed of action but the group without drugs gained slightly more than
either the benzedrine or the caffeine groups. The benzedrine group be-
came somewhat more variable during the experiment, while the caffeine
group became slightly less variable. But the sugar group was most
variable of all both at the beginning and at the end of the experiment.
If these drugs do stimulate motor speed, then the psychological effect of
suggestion is as stimulating as either benzedrine or caffeine for these sub-
jects. It is possible, of course, that the gain in all groups was merely the
result of practice.


Table 2
Speed of Action on a Tapping Test for Three Groups of Subjects

                                            Groups

                              I               II              III
                          Benzedrine       Caffeine         Sugar of
                           Sulphate        Citrate            Milk

First test:   Mean       131.5 ± 1.47    128.2 ± 1.19    128.0 ± 1.91
              Sigma          ...            11.60           18.57

Third test:   Mean           ...         136.8 ± 1.12    136.7 ± 1.69
              Sigma          ...            10.85           16.45

Gains in tapping          6.75 ± 2.21     8.60 ± 1.62     8.74 ± 2.55
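
The probable error attached to a difference between two group means can be checked from the probable errors of the means themselves. The sketch below assumes the usual rule for independent groups, in which the two probable errors are combined as the square root of the sum of their squares; the function name is illustrative and the article does not state its formula. Taking the first-test tapping means of 131.5 ± 1.47 and 128.0 ± 1.91 from Table 2 gives a difference of 3.5 ± 2.41.

```python
import math


def difference_with_pe(mean_a, pe_a, mean_b, pe_b):
    """Return (difference, probable error of the difference).

    Assumes the two means come from independent groups, so the
    probable errors combine in quadrature: sqrt(pe_a**2 + pe_b**2).
    """
    return mean_a - mean_b, math.hypot(pe_a, pe_b)


# First-test tapping means from Table 2:
# benzedrine sulphate 131.5 +/- 1.47, sugar of milk 128.0 +/- 1.91.
diff, pe = difference_with_pe(131.5, 1.47, 128.0, 1.91)
print(f"difference = {diff:.1f} ± {pe:.2f}")
print(f"ratio of difference to its probable error = {diff / pe:.2f}")
```

A difference less than one and a half times its probable error, as here, is well within what chance alone would produce.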

Reading rate on the first and second tests is shown in Table 3. None
of the groups differed significantly in reading speed either at the beginning
or at the end of the experiments, but the amount of gain was related
inversely to initial speed. All groups gained slightly on the second test,
but even the caffeine group, in which the gain was 44 ± 15.55 words, does not
show a statistically reliable increase. There is a suggestion that both
drugs increased the speed of reading slightly, but the small gains with
large probable errors shake any confidence in such a conclusion.


Table 3
Reading Rate in Words per Minute for the Three Groups

                                         Groups

                           I               II              III
                       Benzedrine       Caffeine         Sugar of
                        Sulphate        Citrate            Milk

Form 1:   Mean        453 ± 11.62     439 ± 9.98      474 ± 10.23
          Sigma           ...             97              99.5

Form 4:   Mean        478 ± 13.16     483 ± 11.93     479 ± 12.96
          Sigma           ...            116             126

Gains                  25 ± 17.5       44 ± 15.55       5 ± 16.48
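
The reliability of each gain in reading rate can be judged from the ratio of the gain to its probable error. The sketch below uses the gains from Table 3 and the fixed relation between the probable error and the standard error of a normally distributed quantity (probable error = 0.6745 × standard error). The threshold of four times the probable error is an assumption drawn from common practice of the period; the article itself names no threshold, and the function names are illustrative.

```python
# Relation between probable error and standard error for a normal curve.
PE_PER_SE = 0.6745

# Assumed rule of thumb, not stated in the article: a difference was
# commonly called reliable when it reached about four times its PE.
ASSUMED_RATIO_THRESHOLD = 4.0


def critical_ratio(gain: float, pe: float) -> float:
    """Ratio of a gain to its probable error."""
    return gain / pe


def z_equivalent(gain: float, pe: float) -> float:
    """The same gain expressed in standard-error units."""
    return PE_PER_SE * gain / pe


# Gains in words per minute with their probable errors (Table 3).
gains = {
    "benzedrine sulphate": (25, 17.5),
    "caffeine citrate": (44, 15.55),
    "sugar of milk": (5, 16.48),
}

for group, (gain, pe) in gains.items():
    ratio = critical_ratio(gain, pe)
    verdict = "reliable" if ratio >= ASSUMED_RATIO_THRESHOLD else "not reliable"
    print(f"{group:20s} ratio {ratio:.2f}  z {z_equivalent(gain, pe):.2f}  {verdict}")
```

Under these assumptions the largest gain, that of the caffeine group, is under three times its probable error and falls short of the assumed threshold.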

Some experimenters have suggested that speed of action increases with
drugs but the accuracy of work decreases. The reading comprehension
test was used following the reading rate test to check on this possibility.
The results of the two comprehension tests are presented in Table 4.


Table 4
Reading Comprehension Scores for the Three Groups

                                         Groups

                           I               II              III
                       Benzedrine       Caffeine         Sugar of
                        Sulphate        Citrate            Milk

Form 1:   Mean        8.85 ± 1.10     8.15 ± 1.21     6.65 ± 1.20
          Sigma           ...            1.80            1.78

Form 4:   Mean            ...         7.56 ± .78      7.53 ± 1.10
          Sigma           ...            1.15            1.63

Change                -.92 ± 1.39     -.59 ± 1.43      .88 ± 1.62





Each comprehension test consisted of 10 items. The difference in com-
prehension scores of 2.20 ± 1.29 between Groups I and III is not sta-
tistically significant. The changes in the comprehension scores, showing
a slight decline for the drug groups, do seem to fit the hypothesis that
stimulating drugs decrease the accuracy of one’s work. The amount of
the change is, however, too small to support such a conclusion. It is
possible that variations in individual results when averaged in such large
groups mask the significance of the changes that actually occur. The
decreased sigmas in each group on the second comprehension test tend to
discount the likelihood of such a factor.

One of the major claims of students who use drugs in emergencies is 
that they are much more alert and clear-headed after taking the drugs. 
Such a possibility was investigated by the use of a vocabulary and an 
analogies test as measures of speed of thinking. The results of these
two tests are summarized in Table 5. 


Table 5
Vocabulary and Analogies Scores for the Three Groups

                                           Groups

                               I               II              III
                           Benzedrine       Caffeine         Sugar of
                            Sulphate        Citrate            Milk

Vocabulary:   Mean        33.6 ± 1.31     27.9 ± 1.28     30.1 ± 1.18
              Sigma          12.7            12.4            11.5

Analogies:    Mean        34.5 ± .95      35.0 ± .72      34.6 ± .91
              Sigma           9.25            7.00            8.85








The benzedrine group made the highest mean score on the vocabu-
lary test, 33.6 ± 1.31, and the caffeine group the lowest mean score,
27.9 ± 1.28. The difference between these groups is 5.7 ± 1.83 and
therefore only about three times the probable error of the difference. Al-
though this difference is not clearly significant, it would suggest a stimulat-
ing effect for benzedrine sulphate and a possible loss in thinking ability
for caffeine citrate if the analogies test had not been used as a check.
Means of 34.5 ± .95, 35.0 ± .72, and 34.6 ± .91, respectively, show that
there was no difference in the ability of the groups in handling analogies.
These results fail to support the student opinion that either benzedrine
or caffeine is beneficial to speed of thinking as measured by vocabulary
or analogies tests. It is, of course, possible that there were wide indi-
vidual differences in the results with the drugs, but the sigmas of 12.7,
12.4, and 11.5 for the three groups do not fit an hypothesis of a significant
increase in variability when drugs are used. The sigmas for the analogies
test are likewise quite similar, being 9.25, 7.00, and 8.85.
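
The probable errors attached to the vocabulary and analogies means can be recovered from the sigmas just quoted. The sketch below assumes the conventional formula, probable error of a mean = 0.6745 × sigma / √N, and assumes 43 subjects in each group (129 subjects divided equally among three groups); neither the formula nor the group size is stated at this point in the article, and the function name is illustrative.

```python
import math

PE_PER_SE = 0.6745      # probable error = 0.6745 x standard error
ASSUMED_GROUP_SIZE = 43  # assumption: 129 subjects in three equal groups


def pe_of_mean(sigma: float, n: int = ASSUMED_GROUP_SIZE) -> float:
    """Probable error of a mean from the group sigma and group size."""
    return PE_PER_SE * sigma / math.sqrt(n)


# Sigmas quoted in the text, in the order benzedrine, caffeine, sugar.
vocabulary_sigmas = [12.7, 12.4, 11.5]
analogies_sigmas = [9.25, 7.00, 8.85]

print("vocabulary:", [round(pe_of_mean(s), 2) for s in vocabulary_sigmas])
print("analogies: ", [round(pe_of_mean(s), 2) for s in analogies_sigmas])
```

Under these assumptions the computed values agree with the probable errors reported for the means, 1.31, 1.28, and 1.18 for vocabulary and .95, .72, and .91 for analogies.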

The psychological effect of suggestion in such an experiment as this 
has been discussed earlier, but the subjective reactions of the benzedrine 
and caffeine groups in the present experiment give some additional insight 
into the problem. The reader should remember that the subjects in this 
particular experiment were probably extremely cautious, since they had
been told by their fellow students who had served as subjects in earlier
experiments that some of the pills contained no stimulant. The experi- 
menters, on the other hand, tried to make each subject feel that a real 
stimulant had been given. The subjective observations as given in the 
individual laboratory reports are summarized in Table 6. The sugar
group has been omitted since the reports were almost wholly negative.
The subjects were instructed to note any changes or effects over a 24-hour
period and to watch particularly for any noticeable letdown.


Table 6
Subjective Reports of the Effect of the Pills on Student Efficiency

                                          Frequency      Frequency
                                           for the        for the
                                          Benzedrine      Caffeine
                                            Group          Group

Slightly stimulating
Significant increase in efficiency
Emotional effect impaired efficiency
Sedative or decrease in efficiency
Thought it was psychological
Not sure that there was a stimulant

The results in Table 6 seem to show that the subjects were on their 
guard lest they be led to believe they had been given a stimulant when 
none was present. An overwhelming majority felt that they had received 
no stimulant, or that if they had, the stimulant did not affect their 
efficiency. Two of the subjects, both women, who had the benzedrine 
became hysterical, one before leaving the laboratory and the other about 
five hours later. The first of these subjects reported that she thought 
the effect was psychological, and the other said that she had a nervous 
feeling during the laboratory period. Such extreme results suggested the 
possibility that these drugs might be more upsetting to the women who 
served as subjects in the experiment than to the men. 

In the benzedrine group 75 per cent of the 16 men reported no effect 
and the four who noticed some change were not sure or felt only mildly 
stimulated. Only 15 of the 27 women, or 56 per cent, reported no reaction 
to the benzedrine sulphate. As previously mentioned, the reactions ob- 
served ranged from hysteria through emotional disturbances and nervous- 
ness to wakefulness and mild stimulation. It would appear from these 
reports that benzedrine has more effect on women than on men and that 
the results produced are more variable among the women. 

Caffeine citrate, on the other hand, produced no effect in 70 per cent 
of the women and 69 per cent of the men, showing no sex difference in 
the results. Some readers might feel that the subjects were conditioned 
to the caffeine through its widespread use as a beverage. However, 
among the subjects used in this experiment very few use coffee, and many 
never drink it. The experimenters failed to obtain the number who used 
coffee regularly or at all. A repetition of a similar experiment in 1941-42 
revealed that less than 5 per cent of Lawrence College students have more 
than a casual experience with coffee. The Director of Dormitories has 
confirmed this figure from her records of the use of coffee at meals over 
a period of years. Therefore, it appears that conditioning was not an 
important factor affecting the results of the caffeine citrate group. 








Conclusions 


The experimental evidence to date concerning the effect of benzedrine 
sulphate and caffeine citrate on human efficiency is conflicting. The 
data in this investigation cannot be considered as solving the problem 
with finality, but several suggestive conclusions for college students can 
be drawn. 

1. When college students are unaware of the contents of the pills 
administered and when they are told that each pill should be stimulating, 
the non-drugged group seems to improve practically as much as the 
benzedrine and the caffeine groups. 

2. Unless the subjects are extremely wary, the non-drugged subjects 
will report serious effects ranging from extreme irritability to marked 
lassitude. 

3. Benzedrine sulphate appears to produce subjective reactions that 
can be detected more readily than the effects of caffeine citrate, especially 
among women students. 

4. There is no evidence from the results to indicate that students in 
general can improve their efficiency by the use of either benzedrine sul- 
phate or caffeine citrate. There are probably individual differences, but 
the non-drugged group showed as much variability as the drugged groups. 

5. There is a suggestion from the findings of this study that the 
euphoria reported by many students who have used benzedrine sulphate 
in emergencies has little relation to the level of their efficiency. Reported 
clearness of thinking and rapidity of work are not substantiated by the 
group results. 


A Factor Analysis of Some Clinical Performance Tests 


Joseph C. Heston 


DePauw University 


Much of the recent surge of factorial analysis has been directed at 
the organization of human abilities by analyses based on data collected 
through the use of paper-and-pencil group tests. The experimental bat- 
teries thus used have contained tests largely unfamiliar to the clinician 
who is interested in the factorial composition of more frequently used 
psychometric tools. Some light on this “applied” side of the issue is 
found in the application of factorial techniques to the most widely ac- 
cepted clinical test—the Stanford-Binet scale; Wright (34) and McNemar 
(16) have contributed studies of this type. However, clinicians have long 
used a number of non-verbal or performance tests as supplementary to 
those of the Binet variety, with but little more than arbitrary judgment 
to indicate what these performance tests really measure. Surely a care- 
fully planned analysis of a battery of representative performance tests 
will be worthwhile, both from the standpoint of ability theories and as 
an evaluation of the tests themselves. It has been with this orientation 
that the following investigation has been proposed and executed. 


Historical 


Before describing the present investigation, it should be profitable to 
review briefly some of the factorial studies that have touched upon the 
field of non-verbal tests. This summary will afford opportunity to 
indicate the scope of previous research and at the same time allow com- 
parison of the present findings with those indicated in these earlier studies. 

Murphy (20) used 143 ninth-grade boys as her subjects, giving them 
a battery including the complete Army Beta scale, parts from other tests 
of mechanical aptitude, and some verbal tests. She concluded that 
mechanical ability is not unitary, but is rather a complex of two factors— 
one calling for speed of hand and eye coordination and the other dependent 
upon mental manipulation of spatial relations. An extensive battery of 
performance tests was given by Morris (18) to 56 boys, all close to nine 
and one-half years old. His battery included the complete Pintner- 
Paterson scale, several formboards, the Porteus mazes, block designs, and 
several other tests. He found three factors of significance and labeled 
them: Visualizing, Perceptual Speed, and Induction. 
Price (23) gave 85 British university men a battery containing three 
verbal tests and some common performance tests. He found two factors 
in each of two analyses of his data, one by the Spearman-Holzinger bi- 
factor method and the other by the Thurstone centroid method. His 
interpretation of the latter analysis errs, however, because he failed to 
rotate his centroid factors. If properly rotated, his data reveal one factor 
involving the verbal tests and two of the performance tests, while all the 
other performance tests cluster together along the second factor. In an 
investigation more nearly paralleling the present study, Moffie (17) used 
110 Pennsylvania State College men as subjects. His experimental prob- 
lem was different in that he put some of Thurstone’s Primary Mental 
Abilities tests in his battery and then constructed some non-verbal tests 
which he hoped would measure the same factors. There were also some 
clinical performance tests placed in the battery, giving him 19 test vari- 
ables altogether. Three factors were identified: Space, Perceptual Speed, 
and Induction. Interestingly enough, few of his own performance tests 
and scarcely any of the clinical tests had significant loadings on any factor 
but Space. Thus, outside of the Thurstone tests in the battery, the last 
two factors were very weak indeed. 

An important recent contribution to mental testing is MacMeeken’s 
1935-36 survey of the verbal and non-verbal abilities of representative 
Scottish children (13). Thomson (26) has taken her data and subjected 
them to Thurstone centroid analyses. The tests included the 1916 
Stanford-Binet, Seguin formboard, manikin, Stutsman picture test, Red 
Riding Hood test, Healy PC II, Knox cubes, cube construction, and 
Kohs block designs. Analyses were done for the data from each sex 
separately, with over 400 children (all born in 1926) in each group. Four 
factors were found in each case; however, even two alternative sets of 
rotations for each failed to reach complete simple structure. Thomson’s 
conclusion was: “But it seems also pretty clear that the only two factors 
which are at all certain are the general factor—possibly but not necessarily 
to be identified with “g”—and the rhythm or speed factor linking tests 1, 
2, and 3.” (These were the time scores on the Seguin board, manikin, 
and Stutsman picture test.) The Binet IQ was included in his matrix 
and had a high loading on the so-called general factor and zero on the 
other. 

Seashore and his associates (24) used 50 college men as subjects for 
21 individual motor tests, including simple reaction-times, tapping and 
related actions, Stanford motor skills unit, postural sway, and mechanical 
ability tests. Centroid analysis revealed six factors; orthogonal axes 
were not satisfactory, so the final rotations were oblique. The factors 
were described thus: 

I. Speed of simple reaction. 
II. Finger, hand, and forearm speed—in restricted oscillatory 
movements. 
III. Forearm and hand speed—oscillatory movements of moderate 
extent. 
IV. Steadiness or precision. 
V. Manipulation of spatial relations. 
VI. A residual factor. 


Ninety-one cotton mill fixers were given a battery yielding 37 variables 
for analysis by Harrell (7). The battery included: Minnesota battery of 
mechanical ability tests, seven of the MacQuarrie tests, O’Connor wiggly- 
blocks, Stenquist picture test, some of Thurstone’s tests, pin-boards, and 
three verbal tests. Harrell found five factors and rotated to oblique 
positions, labelling them: 


I. Perception (discrimination). 

II. Verbal (amount of education and the verbal tests). 
III. Youth and Inexperience (shop ratings and records). 
IV. Manual dexterity. 

V. Spatial. 


The mechanical ability tests contained principally Factors I and V. An 
interesting secondary analysis was done by treating the cosines of the 
oblique primary axes as r’s and factoring again by the centroid method. 
This produced two factors which he describes as a “g” factor and a 
common speed factor. 


Method of the Present Investigation 
The Subjects 


In choosing the subjects for the experimental population, homogeneity 
was the factor of prime emphasis. This feature is desirable because the 
greater the homogeneity of the group, the more certain one can be that 
the variances within the resultant data are functionally significant. It 
was decided that homogeneity could best be secured by limiting the sub- 
jects to one sex, one race, a common age range, the same amount of 
educational experience, and membership in an educational category not 
too specialized in nature. The group finally selected, therefore, were all 
white male college students, who were in their first quarter’s residence 
(Autumn 1940) at the Ohio State University, divided quite evenly in 
membership among three of the most general types of curricula: liberal 
arts, education, and commerce. To secure a sampling as nearly chance 
as possible in other characteristics (scholastic aptitude, etc.), the testing 
program was set up so the subjects were drawn by appointment from 
elementary classes in psychology and given laboratory credit for the time 
spent in taking the tests. In all, 113 subjects were thus secured, with a 
mean age of 19.42 years and a SD of 1.65 years. That the men were 
representative students is shown by their Autumn Quarter grade average 
of 2.107 (where A = 4, B = 3, C = 2, and D = 1) with a SD of 0.913. 


The Test Battery 


The tests selected for the final battery were subjected to scrutiny on 
several important points before they were ultimately included. The 
criteria governing this selection may be most conveniently presented in 
summary form: 


(a) Each test should be one that has been in common clinical use, 
either in the exact form used here or some modification. 

(b) Each test should be suitable in difficulty for the group to be tested; 
i.e., there should be a good range of possible scores and enough “ceiling” 
to prevent markedly skewed distributions. 

(c) The problem in each test must be essentially non-verbal. 

(d) The battery should be composed of tests whose nature is varied; 
i.e., not overloaded with any one type of test. 

(e) Directions and instructions must be perfectly explicit for both 
examiners and subjects. 

(f) All scoring must be strictly objective in nature. 

(g) Since the time for examination was limited to two hours, it was 
necessary to find tests that could be given in from 5-10 minutes each, 
thus keeping the number of variables as large as possible for the analysis. 
Some tests, therefore, had to be shortened; however, the essential problem 
in each was not changed by these procedures. 


As indicated in (g) above, some of the tests were not presented in 
their full length. Obviously this may have had an adverse influence on 
the reliabilities of these particular tests, but on the whole it seemed 
justifiable to abbreviate some tests to permit the use of a greater number 
of tests in the battery. It should be emphasized again that in no case 
was the essential nature of the test problem changed, the only disad- 
vantage being the shortening of the behavior sampling obtained. An 
outline of the tests used, their sources, and scoring methods utilized is 
given below: 


1. Healy PC II: Given and scored according to the standard fashion 
described by Healy (8). 

2. Memory-for-Designs: Given and scored as standardized by Cornell 
and Coxe (5). 

3. Visual Recognition: Designs used were those from Wooley (33: pp. 
147-151). Presented according to her directions; scored as number right 
divided by time (R/T). 

4. Knox cubes: The Pintner-Paterson series used here, with the score 
being number right in one trial of the series (22). 

5. Cube analysis: Counting the number of cubes seen in a drawing of 
a pile of cubes. The drawings used were H, G, J, 1, F, K, D, and L from 
Test 2 of Squires (25). Scored as R/T. 

6. R and L Hands: Thirty drawings of hands to be sorted into rights 
and lefts. Taken from Squires (25); scored as R/T. 

7. Dot estimation: Three sets of 8 cards each to be sorted into order 
from least to most dots. While used in several scales in various modifica- 
tions, these cards were designed especially for this investigation (9). 
Scored as R/T. 

8. Picture analysis: Taken from Squires (25); his two series were here 
telescoped into one series made of the more difficult ones from his tests: 
1A, 2B, 3B, 4B, 5B, 6B, 7B, 8B, 9A, 10A, 11B, and 12B. Scored as R/T. 

9. Ferguson formboards: Presented as described in Bronner et al. (3). 
Scored by a new system now in process of standardization, which, while 
built on the same principles as the Meehan-Shimberg system cited by Bronner, 
allows more “ceiling” for adult subjects. Final score is thus a point score 
based on time and accuracy. 

10. Kohs block designs: Kohs (10) scale shortened to include II as 
practice and VI, IX, X, XV, and XVII as the test cards. Scored as total 
time taken for whole series. Kohs’ time limits were observed and 
“penalty” scores assigned to designs not completed within time limits. 

11. Digit-Symbol substitution: Presented as used in Cornell-Coxe scale 
(5), with a time limit of 90 seconds instead of 120. 

12. Minnesota spatial relations: Boards A and B presented as described 
by Paterson (21). Scored as time needed for the two boards. 

13. Minnesota placing test: Given according to Ziegler’s instructions 
(35); one practice trial allowed and then score was time on two test trials. 

14. Minnesota turning test: From Ziegler (35) and given as above. 

15. Painted cube: Taken from McFarlane (14). A model made of 27 
red cubes shown subject, who is then scored on number of seconds required 
to reproduce the model from the necessary loose cubes. 

16. Dearborn board 1C—first trial: Test presented as described in 
Bronner (3). For this variable we used the time on the first trial. 

17. Dearborn board 1C—best trial: The above board presented for 
three trials; this variable being the best time on any of the three. 

18. Feature-profile: Given in the standard fashion as in Arthur (1). 
Score being total time required to successfully complete. 


As will be seen in the outline above, five of the tests (3, 5, 6, 7, and 8) 
were scored by the formula R/T. These ratio scores of accuracy divided 
by time were used because it seemed desirable to have a score 
representing success dependent both upon the speed with which the subject worked 
and the skill with which he met the task. In all these tests the instruc- 
tions were worded so the subject was told to work as fast as possible and 
still try to get the problem correctly done. Support for this use of ratio 
scores is found in Kuhlmann’s (11) recent discussion. He argues that 
the R/T formula really operates to weight speed and accuracy equally and 
is thus the fairest measure of performance. 
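
The R/T scoring described above can be illustrated with a minimal sketch. The function name and the sample figures are hypothetical and are not taken from the study. The sketch shows one way to read the claim of equal weighting: doubling the number right and halving the time raise the score by the same amount.

```python
def ratio_score(num_right: int, seconds: float) -> float:
    """Return the R/T score: number of items right divided by time taken."""
    if seconds <= 0:
        raise ValueError("time must be positive")
    return num_right / seconds


# Hypothetical subject: 20 items right in 100 seconds.
base = ratio_score(20, 100.0)
more_accurate = ratio_score(40, 100.0)  # accuracy doubled, same time
faster = ratio_score(20, 50.0)          # same accuracy, time halved
```

Under these assumed figures the two improved scores are identical, so a proportional gain in accuracy and the same proportional gain in speed count alike.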


Notes on the statistical procedures 


To insure accuracy of the data to the fullest possible extent, punched- 
card techniques were used to permit use of Hollerith equipment in tabu- 
lating and setting up the necessary computations for securing product- 
moment correlations. The Toops gross score formula was used in the 
determination of the coefficients and the basic accuracy of the data 
verified by the S-check and S²-check. 

Thurstone’s centroid method of factor analysis was used as set forth 
in Vectors of Mind (27). His method was followed explicitly, including 
the special provision that signs should be changed in the residual tables 
so the algebraic sum of each column became positive before the estimated 
communality was included. This provision insures bringing out the 
maximum loadings on each successive new factor. A crucial issue in fac- 
tor analysis is the determination of the number of significant factors that 
may be extracted from any given correlational matrix. No exponent of 
factorial methods has thus far advanced any criterion, based on irrefutable 
statistical grounds, that will tell one when to stop the factorial process. 
In the present investigation various criteria were used as suggested by 
Thurstone (30), Blakey’s revised form of this (2), Coombs (4), McNemar 
(15), and Mosier (19). These multiple criteria thus allowed one to reach 
a “composite judgment” by accepting the number of factors most com- 
monly suggested by the majority of the criteria. 
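
The extraction of a single centroid factor, as referred to above, can be sketched as follows. This is an illustration under stated assumptions and not the author's computing routine: the four-test correlation matrix is hypothetical, the communality of each test is estimated by its highest off-diagonal correlation (a common practice with the centroid method, assumed here), and the sign changes required before later factors are taken out are not implemented.

```python
import math


def with_estimated_communalities(r):
    """Copy r, placing each test's highest off-diagonal |r| on the diagonal."""
    n = len(r)
    out = [row[:] for row in r]
    for i in range(n):
        out[i][i] = max(abs(r[i][j]) for j in range(n) if j != i)
    return out


def first_centroid_factor(r):
    """Return the first-centroid loadings and the residual matrix.

    r is a symmetric correlation matrix (list of lists) whose diagonal
    already holds the communality estimates.
    """
    n = len(r)
    col_sums = [sum(r[i][j] for i in range(n)) for j in range(n)]
    root_total = math.sqrt(sum(col_sums))
    loadings = [s / root_total for s in col_sums]
    residual = [
        [r[i][j] - loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]
    return loadings, residual


# Hypothetical 4-test correlation matrix, for illustration only.
R = [
    [1.00, 0.50, 0.40, 0.30],
    [0.50, 1.00, 0.35, 0.25],
    [0.40, 0.35, 1.00, 0.20],
    [0.30, 0.25, 0.20, 1.00],
]
loadings, residual = first_centroid_factor(with_estimated_communalities(R))
residual_col_sums = [sum(row[j] for row in residual) for j in range(len(R))]
```

Every column of the residual matrix sums to zero after a centroid factor is removed, which is why the signs of some tests must be changed before the next factor can be extracted from the residuals.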

One point that has been stated repeatedly by Thurstone is the neces- 
sity for rotating factors yielded by his centroid method into simple struc- 
ture. Incidentally this point is the one most often violated by investiga- 
tors who have professed to have performed Thurstone factor analyses of 
their data. The method of rotation described in Vectors of Mind has 
been superseded by a newer and more economical technique proposed by 
Thurstone in 1938 (29). This method of extended vectors was utilized 
in this analysis. 


Presentation of Data 


The correlational matrix for the 18 test variables is shown in Table 1. 
It will be noted that most of the correlations are positive and, on the 
whole, decidedly small in absolute value. A frequency distribution of 
these 153 coefficients, made without regard to sign, reveals a mean value 
of .186 and a SD of .129. The standard error of a coefficient of this 
magnitude and based on 113 cases is .0908. 
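
The two figures just quoted can be checked with a short sketch. It assumes that the standard error was obtained from the classical formula (1 - r²)/√N; the text does not name the formula, but this one reproduces the reported value. The function names are illustrative.

```python
import math


def n_pairs(n_tests: int) -> int:
    """Number of distinct correlations among n_tests variables."""
    return n_tests * (n_tests - 1) // 2


def standard_error_r(r: float, n: int) -> float:
    """Classical large-sample standard error of a correlation coefficient."""
    return (1 - r ** 2) / math.sqrt(n)


pairs = n_pairs(18)                  # coefficients in the 18-variable matrix
se = standard_error_r(0.186, 113)    # mean |r| and number of subjects
```

With 18 variables there are 153 coefficients, and the standard error rounds to .0908, so a coefficient of the mean size is roughly twice its standard error.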
The centroid analysis performed on the correlational data in Table 1 
was carried to five factors. Table 2 presents the loadings of the various 
tests on these unrotated centroid factors: 














Table 2 
The Centroid Matrix Before Rotation 

                                          Factor 
No.  Name of Test              I      II     III     IV      V      h² 
 1   Healy PC II             .336   .274   .332   .142   .166   .346 
 2   Memory-designs          .457   .244  -.080  -.110   .232   .341 
 3   Visual recog.           .235   .173   .050   .228  -.096   .149 
 4   Knox cubes              .275   .154   .231   .114  -.099   .175 
 5   Cube analysis           .334  -.249   .182  -.231  -.265   .330 
 6   R & L Hands             .320  -.206  -.206   .176  -.216   .253 
 7   Dot estimation          .317  -.082   .505   .152  -.106   .397 
 8   Picture analysis        .400  -.139   .082  -.209   .202   .271 
 9   Ferguson boards         .749   .136  -.125  -.195  -.114   .646 
10   Kohs designs            .788   .036   .113  -.113  -.096   .657 
11   Digit-symbol            .457  -.431   .095   .059   .107   .419 
12   Minn. spatial           .661   .191  -.113  -.256   .110   .564 
13   Minn. placing           .393  -.444  -.236   .096   .263   .486 
14   Minn. turning           .371  -.470  -.371   .093  -.058   .508 
15   Painted cube            .464   .177  -.155  -.224  -.215   .367 
16   Dearborn 1st            .349   .309  -.104   .267  -.053   .272 
17   Dearborn best           .566   .295  -.249   .238   .051   .529 
18   Feature-profile         .478   .094   .094  -.052   .136   .267 
     Mean squared loading    .218   .065   .048   .032   .025   .388 
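
The h² column of Table 2 can be read as the communality of each test, that is, the sum of its squared loadings on the five centroid factors. The sketch below checks this reading for three rows whose entries are taken from the table; the variable and function names are illustrative only.

```python
# Loadings on centroid factors I-V and the tabled h², as read from Table 2.
ROWS = {
    "Kohs designs": ([0.788, 0.036, 0.113, -0.113, -0.096], 0.657),
    "Minn. turning": ([0.371, -0.470, -0.371, 0.093, -0.058], 0.508),
    "Feature-profile": ([0.478, 0.094, 0.094, -0.052, 0.136], 0.267),
}


def communality(loadings):
    """Sum of squared factor loadings for one test."""
    return sum(a * a for a in loadings)


computed = {name: round(communality(a), 3) for name, (a, _) in ROWS.items()}
tabled = {name: h2 for name, (_, h2) in ROWS.items()}
```

For these three tests the computed and tabled values agree to three decimals.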





Application of the various criteria for completeness of factorization, 
as cited above, indicated that only three of these five factors were sig- 
nificant. Seven of the nine criteria used led to this conclusion, so the 
rotational process was thus performed on centroid factors I, II, and III 
(see Table 3). 


Interpretation 


The process of deriving significant meaning from a set of rotated 
factors requires more finesse than any other step in the investigation. By 
way of introduction to the interpretative process, it is well to set forth 
some of the “rules.” One attempts to read meaning into a factor by 
comparing those variables high on that factor with those that are low, 
thus seeking for obvious differences between the two groups of tests. 
Hence, a factor may be described partially in terms of abilities it does 
not represent, as well as by the more straightforward way of telling what 
abilities it apparently does involve. The accepted rule is that a test 
must have a loading of at least .400 on a given factor to be considered 
significantly saturated with that factor; conversely, a loading must be 
around .100 or less to be considered “zero” or negligible. A further 
precaution is that to be adequately interpreted a factor must be 
overdetermined, i.e., it must appear in several tests; Thurstone would require 
the factor to appear in significant saturation in at least three or four 
variables before it can be interpreted with confidence (28). 


Table 3 
The Rotated Factor Matrix 

No.  Name of Test              I      II     III     h² 
 1   Healy PC II             .328   .389  -.150   .281 
 2   Memory-designs          .496   .093   .121   .261 
 3   Visual recog.           .264   .117  -.011   .084 
 4   Knox cubes              .230   .296  -.056   .144 
 5   Cube analysis           .001   .360   .281   .208 
 6   R & L Hands             .110  -.001   .404   .175 
 7   Dot estimation          .021   .599   .012   .359 
 8   Picture analysis        .148   .282   .285   .183 
 9   Ferguson boards         .627   .209   .380   .581 
10   Kohs designs            .520   .453   .367   .610 
11   Digit-symbol           -.022   .376   .516   .408 
12   Minn. spatial           .603   .168   .287   .474 
13   Minn. placing           .013   .062   .636   .409 
14   Minn. turning           .016  -.060   .701   .495 
15   Painted cube            .474   .044   .205   .269 
16   Dearborn 1st            .476   .010   .026   .227 
17   Dearborn best           .649  -.014   .221   .470 
18   Feature-profile         .359   .283   .161   .235 
     Mean squared loading    .141   .074   .111   .327 

Factor Intercorrelations 

I-II     -.059 
I-III     .001 
II-III    .010 

A general comment or so concerning the success of rotation to simple 
structure in the present data should be pertinent here, because the correla- 
tion between the factors must be considered in their interpretation. If all 
factors are completely independent from each other, then they are or- 
thogonally related and the intercorrelations are all zero. The factor 
intercorrelations given in Table 3 show that for all practical purposes 
the three factors here have been rotated into almost perfect simple 
structure. This conclusion is supported, also, by the appearance of at least 
five high and five low tests on each factor column. A further desirable 
feature is that each test should approximate a zero loading on at least 
one factor. The present matrix does not meet that entirely, for there are 
three or four tests whose loadings are more than “negligible” on all 
three factors. 


Factor I 


The first factor is rather easily interpreted, as can be seen from the 
variables compared below: 


Factor I 
High                               Low 
17. Dearborn best       .649        5. Cube analysis      .001 
 9. Ferguson boards     .627        7. Dot estimation     .021 
12. Minn. spatial       .603       11. Digit-symbol      -.022 
10. Kohs designs        .520       13. Minn. placing      .013 
 2. Memory-designs      .496       14. Minn. turning      .016 
16. Dearborn 1st        .476 
15. Painted cube        .474 


It will be seen that the three highest loadings are on formboards where 
the perception of space relations and not manipulative dexterity pays 
the premium in good scores. Tests 10 and 15 require the use of cubes 
that are all the same size and shape, but their placement in relation to 
other cubes in the given space is the essence of the tasks. Shapes and 
spatial relations are also involved in 2 and 16; speed of movement is 
definitely not a factor in 2, and 16 fails to show any significant loading on 
the speed factor to be discussed later. On the zero side of the ledger we 
find that 13 and 14 are boards in which all the blocks fit in any hole and 
the task is to insert them as rapidly as possible, with no necessity for 
choosing which block fits a given hole. In 11, the digit-symbol test, the 
shapes involved are very simple and familiar, so the task boils down into 
a learning procedure plus a premium on mechanical dexterity or speed in 
use of a pencil. The dots test, 7, certainly does not involve shape. The 
last zero loading, on 5, comes on a test that Thurstone’s battery (30) 
would suggest should be a spatial test, for his block-counting test was 
high on that ability. However, there the test was more difficult and was 
scored as R (the number right), whereas in the present test the R/T 
formula was used, and since most subjects got nearly all the items right, 
the T part of the formula becomes the differentiating factor. It is prob- 
ably for this reason that here the test shows up among the speed tests 
rather than with the spatial group. In summary, then, Factor I can 
be safely described as the ability to visualize spatial relationships, with 
speed of movement or manual dexterity playing an almost vanishing role. 
This would seem to be the same factor that has reappeared time and 
again in the earlier studies cited, where it was variously called the spatial 
or visualizing factor. 


Factor III 

Let us turn next to a discussion of the third factor, for it is easier to 
interpret than II and, too, Table 3 shows that III ranks second in the 
amount of variance accounted for. The following tests are the ones to 
be considered here: 


Factor III 
High                               Low 
14. Minn. turning       .701        1. Healy PC II       -.150 
13. Minn. placing       .636        4. Knox cubes        -.056 
11. Digit-symbol        .516        3. Visual recog.     -.011 
 6. R & L Hands         .404        7. Dot estimation     .012 
 9. Ferguson boards     .380       16. Dearborn 1st       .026 
10. Kohs designs        .367 


The three highest loadings here are on the tests which, as we have 
already pointed out, place the primary emphasis on speed of movement 
or manipulative dexterity. Ziegler, in his manual (35), makes the claim 
that the Minnesota placing and turning tests are almost pure tests of 
speed. The present analysis would seem to substantiate that claim, for 
they are high on this speed factor and negligible on the other two factors. 
Tests 9 and 10, although having their heaviest saturations on Factor I, 
can certainly be viewed as tasks in which speed would be a distinct asset. 
The question may be raised why 6 is the only R/T variable to show up 
significantly on this factor. It is the only one of these reaching .400, 
but 5 and 8 are nearer significance than negligibility. Excellent evidence 
for considering this a speed factor will be pointed out on the negative side 
of the picture. 

Foremost in importance from this negative angle we find the Healy 
test, 1, with a loading on the minus side that is sufficiently large to escape 
the label of “negligible.” It is quite within reason that this test, in which 
time is ample for all adult subjects, is one in which speed would be 
actually detrimental, i.e., the more quickly one rushes through his selection 
of answers the more prone he might be to overlook the clue that would 
give him the lead to more appropriate choices. Hanfmann’s analysis (6) 
would corroborate this line of reasoning. To a lesser degree one can also 
see that speed might hurt one’s chance of a high score in the Knox cubes. 
The final and clinching differential may be found in the case of the two 
Dearborn variables. On the first trial this test seems to be almost 
exclusively a “puzzle” in which the task is to select the correct spatial 
relationships, with speed of movement largely inconsequential. The best trial 
of the three, though, is apparently the product of rapid learning and the 
speed of execution now becomes much more of a functional factor. This 
reasoning is borne out here in their respective loadings on Factors I 
and III. We must conclude, then, that Factor III is almost certainly 
one dealing with speed of movement or manipulative dexterity. 


Factor II 


There remains the necessity to attempt an interpretation of Factor II. 
This has not proved as easy or as satisfactory as the task of describing the 
other two factors. The differentiating tests to be examined are: 


Factor II 
High                               Low 
 7. Dot estimation      .599        6. R & L Hands       -.001 
10. Kohs designs        .453       16. Dearborn 1st       .010 
 1. Healy PC II         .389       17. Dearborn best     -.014 
11. Digit-symbol        .376       15. Painted cube       .044 
 5. Cube analysis       .360       14. Minn. turning     -.060 
                                   13. Minn. placing      .062 


Two points should be emphasized at the outset in interpreting this 
factor: (a) it is not sufficiently overdetermined, for only two tests reach the 
desired saturation of as high as .400, and (b) this factor accounts for 
only 7 per cent of the variance of the battery as against 14 and 11 per 
cent for the other two factors. These considerations indicate Factor II 
is the least significant and most unreliably determined of the three factors 
found. With these limitations in mind, the following hypothesis as to its 
nature is suggested as being within the realm of plausibility. 
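
The percentages of variance quoted above agree with the bottom row of Table 3 if each is read as the mean squared loading of the 18 tests on that factor. This reading is an inference from the tabled values and is not stated in the text; the symbols p_k and a_jk are introduced here only for illustration, and the sums of squares are computed from the loadings as read from Table 3.

```latex
p_k = \frac{1}{n}\sum_{j=1}^{n} a_{jk}^{2}, \qquad n = 18

p_{\mathrm{I}}   = \frac{2.540}{18} \approx .141, \qquad
p_{\mathrm{II}}  = \frac{1.337}{18} \approx .074, \qquad
p_{\mathrm{III}} = \frac{2.005}{18} \approx .111
```

Rounded to whole per cents these are 14, 7, and 11, and together they come to about .33 of the total variance of the battery.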

In one sense, this factor can probably be most safely described by 
showing what it does not involve. The zero weights for 15, 16, and 17 (all 
high on I) indicate that II does not require the spatial ability. 
Likewise, the zero loadings for 13, 14, and 6 (all high on III) show it is not 
dependent upon speed. The mutual orthogonality of the three primary 
axes is, of course, discernible in these trends. 

On the positive side of the story one finds it difficult to seize upon any 
essential ability that all the tests high in II seem to require in common. 
In what amounts to frank speculation the following tentative explanation 
is suggested: Factor II involves the ability to grasp readily relationships 
or configurations (Gestalten, if you will), i.e., the ability to comprehend 
relationships of parts and wholes other than those of a spatial nature. 
Test 7, with zero loadings on the other two factors, would involve the 
ability to perceive and judge such relationships between groups of parts 
considered together as wholes. Test 10, high on all three factors, could 
very well require comprehension of the task as a whole as well as the 
spatial perception of the individual cubes. The Healy test, 1, necessitates 
relating clues from one picture to events seen in some of the other pictures. 

Whether or not this ability is related to what Thurstone has termed 
the perceptual factor is hard to determine. Test 8 in the present battery, 
picture analysis, is somewhat similar in nature to Thurstone’s test of 
identical forms, whose highest loading is on the perceptual factor. Here 
test 8 has a weight of .282 on II, which might be taken as an indication of 
some degree of comparability between II and the perceptual factor. It 
should be noted here, though, that in his latest report (32) Thurstone 
classifies the perceptual factor among those that usually appear, but 
cannot yet be identified with confidence. 


Summary 


Eighteen test variables were obtained from a battery of clinical per- 
formance tests given to a homogeneous group of 113 college men. When 
subjected to Thurstone’s method of centroid analysis three significant 
factors were extracted. The significance of these factors was judged by 
the application of nine different criteria that have been proposed as tests 
of completeness of factorization. By use of Thurstone’s method of 
extended vectors, these three factors were rotated into a satisfactory ap- 
proximation of simple structure, all three axes being mutually orthogonal. 

Interpretation of these factors led to satisfactory identification of two, 
while a tentative explanation of the third is suggested. The two factors 
that could be readily described are: (a) Spatial—the ability to visualize 
spatial shapes and relationships, and (b) Speed—the ability for rapid 
movement and dexterity of manipulation. The remaining factor was not 
adequately determined, but there were indications it requires the ability 
to readily grasp non-spatial Gestalt relationships. 

A final comment is that much of the ability required by this battery 
must lie within the unique factor space rather than in the common factor 
space. The battery has only 33 per cent of its variance accounted for 
by these three factors, leaving two-thirds of the variance to be explained 
on the basis of unique and unrelated components. Further, the ease with 
which the matrix was placed in satisfactory simple structure and mutual 
orthogonality indicates no single general ability (“g”) can be found 
common to all the tests. 



Distribution of Scores from Revisions of Army Alpha 


George K. Bennett 
The Psychological Corporation 


A recent article by Hay and Blakemore¹ states that the Nebraska 
Revision of Alpha, when administered to 71 bank clerks, produced a 
“strong negative skew” with scores “bunched toward the ceiling of 
the tests.” 

If this were a characteristic of all revisions of Alpha, the test would 
obviously be unsuited for superior candidates. The recent standardization 
of Modified Alpha Form 9² upon some thousands of high school 
students appears to indicate that the Wells’ revisions of Alpha do not 
have this characteristic. While these results apply solely to Modified 
Alpha Form 9, the mean and standard deviation for this form have been 
experimentally found to be almost identical with the statistics for Revised 











Table 1 
Percentile Equivalents of Modified Alpha Form 9 Scores for High School Seniors 
Total Score 
Percentile 
Boys Girls Equivalents 
184 177 99 
165 161 95 
155 151 90 
144 140 80 
134 133 70 
128 125 60 
121 118 50 
114 111 40 
106 103 30 
97 95 20 
84 84 10 
75 76 5 
57 61 1 
No. of cases.... 794 963 Maximum score........ 212 
Score mean..... 120.01 117.52 
Score sigma.... . 26.00 26.61 





1 Hay, E. N., and Blakemore, A. M. Comparison of Otis and Alpha Test scores 
made by bank clerks. J. appl. Psychol. 26, 6, 850-851. 

? Wells, F. L., Manual for Modified Alpha Examination Form 9, New York: The 
Psychological Corporation, 1943. 
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Alpha Form 8, which in turn does not depart significantly from Forms 
5 and 7. 

In Table 1 are given the percentile equivalents for fairly large numbers 
of high school seniors of both sexes. These subjects were drawn from 23 
secondary schools in all areas of the United States. Since for both boys 
and girls the mean is well over three standard deviations below the 
maximum possible score, such “bunching,” as reported by Hay and 
Blakemore, would be unlikely to occur even upon a highly selected sub- 
sample of this population. 
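
The margin between the means and the ceiling can be checked directly from the statistics in Table 1. The short Python sketch below is illustrative only; the function name `sds_below_ceiling` is an assumption introduced here and does not come from the original note.

```python
def sds_below_ceiling(mean, sigma, maximum):
    """Distance from the group mean up to the maximum score, in sigma units."""
    return (maximum - mean) / sigma


MAX_SCORE = 212  # maximum possible score, from Table 1

# Means and sigmas for high school seniors, from Table 1.
boys = sds_below_ceiling(120.01, 26.00, MAX_SCORE)
girls = sds_below_ceiling(117.52, 26.61, MAX_SCORE)

print(f"boys:  {boys:.2f} sigma below the ceiling")
print(f"girls: {girls:.2f} sigma below the ceiling")
```

Both values come out a little above 3.5, which is the sense in which the means lie "well over three standard deviations" below the maximum.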
Received February 23, 1943. 











The Adaptability Test: A Fifteen Minute Mental Alertness 
Test for Use in Personnel Allocation 


Joseph Tiffin and C. H. Lawshe, Jr. 
Division of Education and Applied Psychology, Purdue University 


Purpose. The Adaptability Test¹ measures mental adaptability or 
mental alertness. The test has a fifteen minute time limit and is useful 
in helping to identify persons who should be placed on jobs that require 
rapid learning and/or the development of independent judgment. It 
also aids in identifying persons who do not readily adapt to new situations 
but who would be satisfactory (and often superior) employees on simple, 
routine jobs such as packing, inspecting, assembling, or in the operation 
of simple, repetitive machines. 


Construction 


Original item selection. One hundred and twenty test items were pre- 
pared so as to sample those types of items which have been used in mental 
ability tests. The original sampling included items similar to those iden- 
tified by Thurstone’s analysis of primary mental abilities which could be 
practically included in a test of this length and type. The items were 
then classified into two groups, the sixty which the authors judged to be 
more difficult comprising one group, and the sixty judged to be least 
difficult in the other group. 

Preliminary administrations. Following this division, the items 
judged least difficult were administered on a no time limit basis to a group 
of Navy trainees assigned to receive instruction prior to being rated as 
electrician mates. The 25 per cent of the papers with the highest scores 
and the 25 per cent of the papers with the lowest scores were segregated 
and discrimination values for each item were computed by the Kelley 
technique as described by Lawshe.² All items which did not yield 
D-values of .5 or better were discarded. 
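
The upper-versus-lower-quarter comparison described above can be sketched in Python. The article does not give the formula behind the D-value, so this sketch makes a loud assumption: it takes the index to be the difference between the normal deviates of the proportions passing in the top and bottom 25 per cent of papers. The function names, the clamping of extreme proportions, and the sample data are all hypothetical and should not be read as Lawshe's nomograph.

```python
from statistics import NormalDist


def upper_lower_indices(total_scores, fraction=0.25):
    """Return the positions of the highest and lowest scoring papers."""
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    k = max(1, int(len(total_scores) * fraction))
    return order[-k:], order[:k]


def discrimination_index(item_passed, total_scores, fraction=0.25):
    """Assumed index: z(p_upper) - z(p_lower) for one item (not from the source)."""
    upper, lower = upper_lower_indices(total_scores, fraction)
    k = len(upper)
    floor = 0.5 / k  # keep proportions away from 0 and 1 so the deviates stay finite

    def clamped_proportion(group):
        p = sum(item_passed[i] for i in group) / k
        return min(max(p, floor), 1 - floor)

    z = NormalDist().inv_cdf
    return z(clamped_proportion(upper)) - z(clamped_proportion(lower))


# Hypothetical data: eight papers, total scores and pass (1) / fail (0) on one item.
totals = [3, 5, 8, 9, 12, 14, 17, 20]
passed = [0, 0, 1, 0, 1, 1, 1, 1]
d = discrimination_index(passed, totals)
print(f"discrimination index: {d:.3f}")
```

Under this assumed definition, an item would be kept when the index reaches the chosen cut-off (the article uses .5).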

Likewise, the items judged to be more difficult were administered to 
a group of engineering juniors and seniors. The same analysis technique 
was employed and all items having D-values below .5 were discarded. 

Eighty items selected by means of these preliminary administrations 
were combined, arranged in approximate order of difficulty, and admin- 
istered to a second group of Navy trainees on a no time limit basis. 


1 Published by Science Research Associates, 1700 Prairie Avenue, Chicago, Illinois. 


2 Lawshe, C. H., Jr., A nomograph for estimating the discrimination value of test 
items. J. appl. Psychol., 1942, 26, 846-849. 
Again the same item analysis technique was employed and the seventy 
most discriminating items were retained, none of which had D-values 
less than .5. Thirty-five items each were allocated to Forms A and B 
by matching both on the basis of item difficulty and D-value. In each 
form, items were arranged in increasing order of difficulty. Each form 
has an average D-value of 1.0 for its items. Forms A and B were then 
administered to a third group of Navy trainees and the time limit of 15 
minutes was established because this was the shortest time that resulted 
in satisfactory reliability. 

The introductory paragraph on the title page of the test was developed 
through consultation with industrial personnel men and actual try-out 
with applicants and employees. The introduction which is reproduced 
below makes no mention of mental ability, intelligence, or the I.Q. and 
represents an effort to put at ease the individual taking the test: 


Some jobs require figuring—such as adding, subtracting, multiplying, 
and dividing—while others require writing reports or answering letters, 
and still other jobs can be done well by people who are not particularly 
apt with figures or words. This test will help in determining how well 
you can handle jobs that require these abilities. 

Do as well as you can on this test, but do not worry about it. Re- 
member that you may be well qualified for certain jobs that require 
training or skills different from those covered in this test. 


Comparability and Reliability of Forms 


The matching procedure discussed above resulted in a high degree of 
comparability between Forms A and B. Not only are the two forms 
comparable on the basis of all thirty-five items in each form but also on 
the basis of identical portions of the forms as is shown in Table 1. 


Table 1 
Average D-Values and Difficulty Levels of Forms A and B by Parts 

                           Form A                    Form B 
                     Average     Average       Average     Average 
                     D-value     Percent       D-value     Percent 
First 10 items     [illegible]     86            1.0          87 
First 20 items     [illegible]     75            1.0          75 
First 30 items     [illegible]     64            1.0          64 
Total 35 items         1.0         55            1.0          55 





In this table are presented the average D-values for the first ten, the 
first twenty, the first thirty, and all thirty-five items in each test. Also, 
the percentages of trainees passing each item were used in computing the 
average percentage for the first ten, the first twenty, the first thirty, and 
all thirty-five items in each form. 

Both Forms A and B of the Adaptability Test were administered to 
a total of 942 industrial applicants, employees, trainees, and university 
students. In each group, Form A was given first to approximately half 
the persons tested, and Form B was given first to the remaining half. 
Means and standard deviations for both forms of the test as well as for 
the first and second form given are presented in Table 2 and indicate the 
comparability of the two forms. 








Table 2 
Comparability of Forms A and B of the Adaptability Test 

                          First Test Taken                    Second Test Taken 
                       Form A           Form B             Form A           Form B 
                     N   Mean  S.D.   N   Mean  S.D.     N   Mean  S.D.   N   Mean  S.D. 
Female applicants   377  10.6  5.35  235  10.5  5.59    235  12.0  5.30  377  11.1  5.53 
Navy trainees       107  19.0  4.58   94  17.9  5.14     94  18.6  5.02  107  17.7  5.14 
Clerical workers     42  18.7  4.01   44  20.6  6.50     44  21.2  6.32   42  18.7  7.01 
Purdue seniors       22  27.6  2.63   21  24.4  5.19     21  26.7  4.86   22  26.8  3.05 
All groups          548  13.5  6.62  394  14.1  7.26    394  15.4  6.95  548  13.6  7.02 
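
The "All groups" means in Table 2 can be checked against the group rows by weighting each group mean by its number of cases. The Python sketch below is illustrative; `pooled_mean` is a name introduced here, and the figures are the first-test columns of Table 2 as read from the printed table.

```python
def pooled_mean(groups):
    """Weighted mean of group means; groups is a list of (n, mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


# First test taken, by form (N, mean) for applicants, trainees, clerks, seniors.
form_a_first = [(377, 10.6), (107, 19.0), (42, 18.7), (22, 27.6)]
form_b_first = [(235, 10.5), (94, 17.9), (44, 20.6), (21, 24.4)]

print(f"Form A first: {pooled_mean(form_a_first):.1f}")
print(f"Form B first: {pooled_mean(form_b_first):.1f}")
```

The results agree with the tabled values of 13.5 and 14.1 for all groups combined.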





Reliability. Estimates of the reliability of the test were obtained by 
the split-half and the test-retest procedures. Coefficients of correlation 
found by the split-half method (odd versus even) and stepped up by 
means of the Spearman-Brown formula are presented in Table 3. The 
r’s for the various groups on the two forms ranged from .60 to .90. For 
all groups tested the r’s are .90 for Form A and .88 for Form B. 


Table 3 
Reliability of the Adaptability Test as Determined by Split-Half Method 

                              Form A                       Form B 
                       N    S.D.    r    P.E.m      N    S.D.    r    P.E.m 
Female applicants     617   5.36   .84   1.44      611   5.36   .81   1.62 
Navy trainees         211   4.80   .80   1.42      212   5.19   .80   1.54 
Clerical workers       86   5.46   .76   1.76       86   6.83   .83   1.88 
Purdue seniors         43   3.02   .73   1.87       43   4.41   .60   1.88 
All groups            957   6.83   .90   1.48      952   7.10   .88   1.67 
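
The Spearman-Brown step-up applied to the odd-even correlations can be sketched as follows. This is a minimal Python illustration of the standard formula; the function name and the sample half-test correlation are assumptions, since the article reports only the stepped-up coefficients.

```python
def spearman_brown(r_half, factor=2):
    """Estimate the reliability of a test lengthened by `factor` times."""
    return factor * r_half / (1 + (factor - 1) * r_half)


# Hypothetical odd-even correlation between the two half-tests.
r_odd_even = 0.80
r_full = spearman_brown(r_odd_even)
print(f"half-test r = {r_odd_even:.2f}, stepped-up r = {r_full:.3f}")
```

With `factor=2` the formula reduces to 2r / (1 + r), the form used for split-half estimates.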
Correlations obtained by the test-retest method are presented in 
Table 4 and represent two approaches, one in which scores on Form A 
were correlated with scores on Form B regardless of the order in which the 
forms were taken, and the other in which scores on the first test taken 
were correlated with scores on the second test taken regardless of form. 
The correlations obtained by the former method range from .79 to .93 
and those obtained by the latter method range from .78 to .89. For the 
entire population both methods yielded an r of .89. 














Table 4 
Reliability of the Adaptability Test as Determined by Test-Retest Method 

                          Form A vs. Form B           First Test vs. Second Test 
                            Ave.                         Ave. 
                       N    S.D.    r    P.E.m      N    S.D.    r    P.E.m 
Female applicants     612   5.46   .81   1.58      612   5.46   .82   1.33 
Navy trainees         201   5.00   .79   1.52      201   5.31   .84   1.21 
Clerical workers       86   6.15   .89   1.37       86   6.15   .89   1.38 
Purdue seniors         43   4.17   .93    .73       43   4.45   .78   1.41 
All groups            942   6.97   .89   1.55      942   7.02   .89   1.57 





Tables 3 and 4 also show probable error of measurement values computed 
from the average standard deviations and the correlations of the 
various groups. These values presented in the two tables range from .73 
to 1.88 for the several groups. Where the entire populations are con- 
sidered, the four values range from 1.48 to 1.67. If a value of 1.5 is 
accepted as the probable error of measurement, then it can be expected 
that 50 percent of a given individual’s obtained scores will lie within a 
range of 1.5 points (plus or minus) of his true score. 
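
The probable error of measurement and the 50 per cent statement can be illustrated in Python. The article does not print the formula, so this sketch assumes the conventional one, P.E. = 0.6745 × S.D. × sqrt(1 − r); the function and constant names are introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist

PROBABLE_ERROR_FACTOR = 0.6745  # normal deviate that encloses the middle 50 per cent


def probable_error_of_measurement(sd, r):
    """Assumed conventional formula: 0.6745 * S.D. * sqrt(1 - r)."""
    return PROBABLE_ERROR_FACTOR * sd * sqrt(1 - r)


# All groups, Form A vs. Form B (Table 4): average S.D. 6.97, r = .89.
pe_all = probable_error_of_measurement(6.97, 0.89)

# Share of a normal distribution lying within one probable error of its centre.
share_within_one_pe = 2 * NormalDist().cdf(PROBABLE_ERROR_FACTOR) - 1

print(f"P.E. of measurement: {pe_all:.2f}")
print(f"share within +/- 1 P.E.: {share_within_one_pe:.3f}")
```

Under this assumption the computed value falls within about a hundredth of the tabled 1.55, and half of the obtained scores are expected to lie within one probable error of the true score.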


Validity 


The validity of the Adaptability Test for use in certain employee and 
trainee allocation situations is shown by the relationship between scores 
on the test and (1) success on the job of employees in an industrial clerical 
department and (2) success of Naval recruits assigned to a training pro- 
gram for ship electricians. 

Clerical employees. The effectiveness of this test in identifying persons 
who have the greatest ability to adapt themselves quickly and effectively 
to job situations that are not strictly routine in nature is shown by an 
experiment carried on with the clerical force of a paper mill. The 88 
employees in this department were divided by supervisors into two 
groups, A and B. The A employees were those who had demonstrated 
the greatest adaptability, alertness, and general worth to the company. 
There were 50 A employees in the department. The B employees were 
those who had trouble in adapting themselves to new jobs, and who 
required much specific instruction on each new type of work. There 
were 38 B employees in the department. 

Forms A and B of the Adaptability Test were each given to all 88 
employees (50 A employees and 38 B employees). Approximately half 
the employees took Form A of the test first, and followed this with Form 
B. The remainder of the employees took Form B first, and followed this 
with Form A. The average Adaptability Test scores for both the A and 
B employees groups are summarized in Table 5. 


Table 5 
Average Adaptability Test Scores of the High Rated Clerical Employees (Group A) 
and the Low Rated Clerical Employees (Group B) 

                                     Group A (50    Group B (38    Difference 
                                     High Rated     Low Rated      between A     Critical 
                                     Employees)     Employees)     and B         Ratio 
Average Score on Form A of the 
  Adaptability Test ...........         21.7           18.4           3.3           3.2 
Average Score on Form B of the 
  Adaptability Test ...........         22.2           17.2           5.0           4.5 
Average Score on First Form of 
  the Test Given ..............         21.7           17.2           4.5           4.6 
Average Score on Second Form 
  of the Test Given ...........         23.6           18.6           5.0           4.8 





Table 5 shows that the A employees made a significantly higher score 
than the B employees on both forms of the Adaptability Test, since all 
critical ratios are greater than 3.0. This means that if two employees 
have different scores on this test, the one with the higher score is more 
likely to be an A employee than the one with the lower score. 

A graph which expresses these results in another way is shown in 
Figure 1. On the left side of this figure is shown the percent of A em- 
ployees obtained when no test is used, but only standard previous 
employment procedures are followed. Under these circumstances, 50 
of the 88 employees, or 57 per cent, were in the A group. If a minimum 
Adaptability Test score of 15 had been required, 64 per cent of the em- 
ployees selected for this work would have been in the A group. With 
successively increasing minimum Adaptability Test scores required, suc- 
cessively higher percentages of A employees would be placed on the job. 
If a minimum test score of 30 on this test were required, 88 per cent of 
the employees would be in the A group. The increase from 57 per cent 
A employees (obtained without the test) to 88 per cent A employees 
(obtained with a minimum score of 30 required) represents an increase 
of 31 per cent from a base of 57 per cent, or an increase of 54 per cent in 
the actual number of A employees on the job. 
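
The arithmetic behind these percentages can be laid out in a few lines of Python. The variable names are introduced here for illustration; the counts (50 A and 38 B employees) and the 88 per cent figure are taken from the text.

```python
a_employees = 50  # rated A by supervisors
b_employees = 38  # rated B by supervisors

# Per cent of A employees with no test requirement (the base rate).
base_rate = 100 * a_employees / (a_employees + b_employees)

# Per cent of A employees reported for a minimum score of 30.
with_minimum_30 = 88

point_gain = with_minimum_30 - round(base_rate)       # gain in percentage points
relative_gain = 100 * point_gain / round(base_rate)   # gain relative to the base

print(f"base rate: {base_rate:.1f} per cent")
print(f"gain: {point_gain} points, or {relative_gain:.0f} per cent of the base")
```

The 31 per cent is thus a difference in percentage points, and the 54 per cent expresses that difference relative to the 57 per cent base.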

Fig. 1. Graph showing the percentages of clerical employees who were in the “A” 
rated (good) group when no test was employed and the percentage who would have been 
in the “A” group had successively higher minimum scores on the Adaptability Test 
been used to supplement existing hiring procedures. 


The above results should not be interpreted to mean that this test (or 
any test) should replace other employment procedures. The results 
shown above indicate the effectiveness of the total employment procedure 
when it includes the use of the Adaptability Test. 

The relation between Adaptability Test scores and success of the 
employees in this department may also be expressed in terms of the 
Bi-serial coefficients of correlation between test scores and rated success 
on the job. The correlations are summarized in Table 6. 

Table 6 shows that both Forms A and B of the Adaptability Test, as 
well as the first test given (regardless of form), relate significantly to rated 
success on the job. 
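
A bi-serial coefficient of the kind reported here can be sketched in Python. The article does not show its computation, so this sketch assumes the textbook formula r = ((M_high − M_low) / S.D.) × (pq / y), where p and q are the proportions in the two rated groups and y is the ordinate of the normal curve at the point dividing them. The function name and the sample scores are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def biserial(scores, rated_high):
    """Textbook bi-serial r between a score and a two-way rating (assumed formula)."""
    high = [s for s, h in zip(scores, rated_high) if h]
    low = [s for s, h in zip(scores, rated_high) if not h]
    p = len(high) / len(scores)
    q = 1 - p
    normal = NormalDist()
    y = normal.pdf(normal.inv_cdf(p))  # ordinate at the dividing point
    return (mean(high) - mean(low)) / pstdev(scores) * (p * q / y)


# Hypothetical test scores for ten employees, five rated A and five rated B.
scores = [18, 22, 15, 25, 20, 12, 17, 14, 19, 10]
rated_a = [True] * 5 + [False] * 5
r_bis = biserial(scores, rated_a)
print(f"bi-serial r = {r_bis:.2f}")
```

The formula presumes that the rating reflects an underlying normally distributed trait cut at a single point, which is why the normal ordinate appears in it.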

Table 6 
Bi-Serial Coefficients of Correlation between Adaptability Test Scores and 
Success of Clerical Employees on the Job 

                                   Bi-serial correlation with 
                                      success on the job          P.E. 
[Form A] ......................             .40                   .08 
[Form B] ......................             .56                   .07 
[First test given] ............             .59                   .07 
[Second test given] ...........             .65                   .06 


Navy electrical trainees. Further evidence of the effectiveness of the 
Adaptability Test in identifying persons who are apt to learn best was 
obtained when the test was used in conjunction with a Navy training 
program for electricians. Sailors are inducted into this program after 
about four weeks of basic naval training. Prior to entering the program, 
200 men already allocated to the school were given both forms of the 
Adaptability Test, half receiving Form A first, and half receiving Form B 
first. Grades made in the school during the first four weeks of the 
training program were analyzed and compared with pre-training test 
scores. Grades issued in the school ranged from 4.0 down, with 2.5 
established as “passing.” Grades received from five instructors each 
week for four weeks were averaged to obtain the grade point average 
which was used as a measure of the success of the trainees. On the basis 
of these grades, the trainees were classified into three groups: the highest 
25 per cent, middle 50 per cent and lowest 25 per cent. Average test 
scores of these three groups were computed and are tabulated in Table 7. 
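
The grouping step described above can be sketched in Python. The records below are hypothetical (the article does not list individual trainees), and the function names are introduced here only for illustration.

```python
from statistics import mean


def split_by_grade(records, top=0.25, bottom=0.25):
    """Split (grade_point_average, test_score) records into three grade groups."""
    ranked = sorted(records, key=lambda rec: rec[0], reverse=True)
    n = len(ranked)
    n_top = round(n * top)
    n_bottom = round(n * bottom)
    return {
        "highest": ranked[:n_top],
        "middle": ranked[n_top:n - n_bottom],
        "lowest": ranked[n - n_bottom:],
    }


def mean_test_scores(groups):
    """Average test score within each grade group."""
    return {name: mean([score for _, score in members])
            for name, members in groups.items()}


# Hypothetical trainees: (grade point average, Adaptability Test score).
trainees = [(3.8, 24), (3.6, 21), (3.2, 17), (3.1, 16),
            (3.0, 15), (2.9, 18), (2.6, 12), (2.4, 13)]
group_means = mean_test_scores(split_by_grade(trainees))
print(group_means)
```

Comparing the group means in this way is the same kind of summary that Table 7 gives for the 200 trainees.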


Table 7 
Average Adaptability Test Scores of 200 Navy Electrical Trainees, Divided into 
Highest 25%, Middle 50%, and Lowest 25% According to Average Grades 
Received During Four Weeks of Training 

                                Highest 25%    Middle 50%    Lowest 25% 
Mean Test Score ............       20.6           15.8          13.4 
[row label illegible] ......        .39            .13           .48 
Standard Deviation .........       4.24           3.60          5.30 








The difference of 7.2 between the mean of the high group and the mean 
of the low group is significant in that it is 12.4 times the probable error of 
the difference. 

These results are presented graphically in Figure 2. On the left side 
of this figure are plotted the percentages of trainees who equal or exceed 
grade point averages of 3.0 and 3.5, respectively, when no test is used. 
In this case, 71 per cent of the trainees equal or exceed an average of 3.0 
and 22 per cent equal or exceed 3.5. If a minimum Adaptability Test 
score of 15 had been established for entrance to the training program, 
85 per cent of the trainees would have equalled or exceeded 3.0, and 27 
per cent would have equalled or exceeded 3.5. Likewise, had a minimum 
entrance score of 20 been established, 94 per cent of the group would 
have equalled or exceeded 3.0 and 57 per cent would have equalled or 
exceeded 3.5. Grades made during the first seven weeks of training 
correlated with initial scores on the Adaptability Test yield an r of .64. 
The successive rise of the percentages of trainees equalling or exceeding 
specific performance levels with successively higher minimum test scores 
together with this correlation may be interpreted to mean that this test 
will effectively supplement allocation techniques now being used to select 
men for this particular training program. 


Fig. 2. Graph showing the percentage of Navy electrical trainees who equaled 
or exceeded indicated grade point averages when existing allocation techniques are 
employed and the percentages attaining or exceeding these grade point averages had 
successively higher minimum scores on the Adaptability Test been used to supplement 
existing procedures. 
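
The kind of tabulation behind Figure 2 can be sketched in Python: for each minimum test score, keep only the trainees who meet it and find the per cent who reach a given grade point average. The records and the function name are hypothetical; only the procedure follows the text.

```python
def percent_meeting_standard(records, min_test_score, grade_standard):
    """Per cent of trainees at or above the grade standard among those
    whose test score meets the minimum; None if nobody qualifies."""
    selected = [gpa for gpa, score in records if score >= min_test_score]
    if not selected:
        return None
    return 100 * sum(gpa >= grade_standard for gpa in selected) / len(selected)


# Hypothetical trainees: (grade point average, Adaptability Test score).
trainees = [(3.8, 24), (3.6, 21), (3.2, 17), (3.1, 16),
            (3.0, 15), (2.9, 18), (2.6, 12), (2.4, 13)]

no_test = percent_meeting_standard(trainees, 0, 3.0)
minimum_15 = percent_meeting_standard(trainees, 15, 3.0)
print(f"no test: {no_test:.1f} per cent; minimum of 15: {minimum_15:.1f} per cent")
```

A minimum score of zero corresponds to the "no test" condition plotted at the left of the figure.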


Administration, Scoring, and Interpretation 


The Adaptability Test is self-administering. The person or persons 
to be tested should be given plenty of time to read carefully the first page 
of the test. In testing groups, it is advisable for the examiner to read 
the first page slowly while the persons taking the test follow the reading 
on their copies of the test. After the introductory page has been read, 
the examiner should instruct the persons being tested to open their test 
booklets and begin the test. At the end of exactly 15 minutes, the examiner 
should say: “Stop. The time is up on this test. Please close your booklets.” 
As the introductory explanation is identical for Forms A and B of the 
test, the two forms can be given simultaneously to a group, part of which 
is taking Form A and part Form B. 

Forms A and B are equivalent in difficulty, time of administration, and 
interpretation. The two forms have been prepared to permit retesting 
of doubtful cases or of any persons who may have had access to one of 
the forms for previous study. The score on either form is the number of 
items answered correctly. 

Interpretation. While high scores on this test are positively related 
to success on jobs that require mental alertness or ability to follow in- 
structions quickly and accurately, it has long been observed that persons 
making low scores on tests of this type are often more successful on 
simple, routine manipulative jobs than are persons who score high. This 
relationship has been found in studies such as those of Bills,³ Pond and 
Bills,⁴ and Tiffin and Greenly.⁵ 

The conclusions reached in these investigations were substantiated in 
part by administering the Adaptability Test to a force of 28 female 
employees on a simple repetitive inspection job in a paper wrapping 
department. These operators were divided according to the super- 
visor’s ratings of job performance into an A (or superior) group and a B 
(or inferior) group. The average Adaptability Test scores of both groups 
are shown in Table 8. 


Table 8 
Average Adaptability Test Scores of High Rated Inspectors of Simple Material 
(Group A) and Low Rated Inspectors (Group B) 

                                                 Group A            Group B 
                                              (14 High Rated     (14 Low Rated 
                                                Inspectors)        Inspectors) 
Average Score on Form A of Adaptability 
  Test ..................................          10.9               11.5 
Average Score on Form B of Adaptability 
  Test ..................................          11.8               12.5 





3 Bills, M. A. Relation of mental alertness test scores to positions and permanency 
in company. J. appl. Psychol., 1923, 7, 154-156. 

4 Pond, Millicent, & Bills, M. A. Intelligence and clerical jobs. Two studies of 
relation of test score to job held. Person. J., 1933, 12, 41-56. 

5 Tiffin, Joseph, & Greenly, R. J. Employee selection tests for electrical fixture 
assemblers and radio assemblers. J. appl. Psychol., 1939, 23, 240-263. 
On both forms of the Adaptability Test the better inspectors made a 
slightly lower average score than the poorer inspectors. The results 
summarized in Table 8 are not sufficiently conclusive, by themselves, to 
prove that persons scoring low on the Adaptability Test will necessarily 
be better employees on simple, routine jobs than persons scoring high. 
However, these results coupled with the findings of Bills, Pond and Bills, 
and Tiffin and Greenly mentioned previously strongly suggest that 
persons with low scores on this test are probably not poorer than average 
on such kinds of work and, in many cases, may in the long run be superior 
employees on simple routine jobs. 

Recommendations. The following recommendations are used in hiring 
in one plant using the test and are made in the light of the several 
investigations on this subject. They are based on norms obtained from 
administering the Adaptability Test (both Forms A and B) to 1381 
industrial employees, applicants, trainees, and students. These 
recommendations are intended to be purely suggestive. Not all persons who 
score high on the Adaptability Test will succeed on jobs that involve the 
exercise of responsible judgment because a number of such persons will be 
lacking in other personal characteristics necessary for success. Also there 
is not a sharp dividing line between the categories indicated below. They 
shade gradually from one to another. 


Score on Adaptability Test (Form A or B): 30-35 

Recommendations: Should be placed on jobs where there is a real 
opportunity for the development of originality, independent thinking, and 
critical judgment. Persons in this bracket are usually very superior on jobs 
involving verbal or numerical facility. They will learn simple jobs with 
great rapidity, but will soon tire of such jobs and find the “mind 
wandering” for new areas to conquer. 

Type Jobs*: Potential managers, executives or junior executives, private 
secretaries. 


Score on Adaptability Test (Form A or B): [illegible] 

Recommendations: Should be placed on jobs that require the development 
of some independent judgment and the opportunity to shift the type of work 
done from time to time. In general, persons in this bracket learn quickly, 
and easily understand instructions. Some of them, particularly those whose 
scores are close to 29, have the ability to become as versatile employees as 
those whose scores are 30 or above. 

Type Jobs*: Office clerks, salesmen, apprentices and tradesmen, 
technicians, supervisors. 


Score on Adaptability Test (Form A or B): [illegible] 

Recommendations: Should be placed on jobs that require considerably 
less independent judgment, adaptability to new situations, and shifting of 
processes from day to day. This category includes the average or “run of 
the mill” persons. They are neither outstandingly high nor outstandingly 
low in adaptability or mental alertness. They are more adaptable to new 
situations than persons scoring in the two lower brackets on this test. 

Type Jobs*: Set-up men, machine specialists, non-automatic machine 
operators, store salespeople, store keepers, standards checkers. 


Score on Adaptability Test (Form A or B): [illegible] 

Recommendations: Should be placed on jobs that vary only slightly 
from time to time, and at no time call for exercise of independent judgment. 
Persons in this bracket are usually below average in ability to understand 
directions. Job trainers and supervisors will find it necessary to teach the 
job in a step by step manner. 

Type Jobs*: Packers, assemblers, inspectors, operators of simple machines 
which require only routine adjustment. 


Score on Adaptability Test (Form A or B): [illegible] 

Recommendations: Should be placed on very simple, repetitive jobs. 
Persons in this category are at the bottom so far as verbal tasks are 
concerned. They usually have a definite ceiling on their ability to adapt to 
jobs except those of a simple repetitive nature. While they usually require 
careful supervision during the learning period, they often become quite 
expert and self-sufficient on simple jobs where the job does not change. 

Type Jobs*: Simple packers, simple assemblers, fixed gauge inspectors, 
operators of machines that require practically no adjustment or specific 
care by the operator. 


* These jobs are presented only as a rough indication of possible employee allocation. 
Job descriptions and job demands vary so greatly among companies that it is impossible 
to give a universally applicable list of typical jobs for the different Adaptability Test 
score brackets. 


Norms 


In addition to the verbal descriptions of various levels of performance 
on the test presented in the recommendations, norms for a number of 
different groups are presented in Table 9. Since some groups score 
slightly higher when a second form of the test is given, norms on both the 
first test given and the second test given are shown. As will be noted in 
Table 9, groups differ rather markedly in their performance on the test. 
For example, on the first administration of the test, ten per cent of the 
Navy trainees scored 24 or above, ten per cent of the female applicants 
scored 18 or above, and ten per cent of the clerical workers scored 26 or 
above. Users of the test will probably want to set up norms for their 
own situation to supplement the norms presented in Table 9. 


Table 9 
Norms for the Adaptability Test 
[table not reproduced] 








Extension of The Minnesota Rate of Manipulation Test 


Clifford E. Jurgensen 
Kimberly-Clark Corporation, Neenah, Wisconsin 


This report concerns the use of the Minnesota Rate of Manipulation 
Test in a corporation which manufactures pulp, paper, and paper special- 
ties. The test form used was the Ziegler revision containing sixty holes, 
fifteen holes in each of four rows. “It measures the speed with which a 
subject can place relatively large circular blocks in holes in a board. 
Performance is not dependent upon judgment of differences in size and 
shape, nor upon great precision in eye-hand coordination. Success is 
dependent upon mere speed of gross hand and arm movements. Pre- 
sumably, analogous situations in industry would be packing, wrapping, 
etc.” (1). 

In standardized administration, “hand speed” is measured by the 
time required to place the blocks in the holes using the right (or preferred) 
hand. “Finger speed” is measured by the time required to turn the 
blocks over, using both hands. A practice trial is followed by four timed 
trials, and the score is the total number of seconds required in four trials. 

The test procedure discussed here extended Ziegler’s procedure to 
give seven additional speed measures. The first three of these were com- 
parable to Ziegler’s Right Hand Placement except that they required the 
use of (1) left hand, (2) both hands moving simultaneously, and (3) both 
hands moving alternately. The other four additional measures were also 
similar to Ziegler’s Placement Test except that the subject was required 
to turn the blocks as well as place them in the holes. This was done in 
the following ways: (1) right hand, (2) left hand, (3) both hands simul- 
taneously, and (4) both hands alternately. In each of the nine pro- 
cedures, the subject was permitted a practice trial and two timed trials. 
The score was the total number of seconds required for the two timed 
trials. 

The order of administering the test parts and symbols which will 
hereafter be used to identify each test procedure are as follows: 


1. Place, right hand. (PR) 
2. Place, left hand. (PL) 
3. Place, both hands simultaneously. (PS) 
4. Place, both hands alternately. (PA) 
5. Turn. (T) 
6. Turn and place, right hand. (TPR) 
7. Turn and place, left hand. (TPL) 
8. Turn and place, both hands simultaneously. (TPS) 
9. Turn and place, both hands alternately. (TPA) 


Instructions given the subject for test parts PR and T were essentially 
those given in the manual of directions accompanying Ziegler’s test form. 
Instructions for additional procedures were similarly worded, differing 
only insofar as necessary to explain movements required in the various 
test parts. All instructions were clarified by the test administrator by 
means of a demonstration showing how the blocks should be handled, the 
order of picking them up, and the order of placing them in the holes of 
the board. Demonstrations were limited to correctly placing eight blocks 
in test parts requiring the handling of one block at a time (PR, PL, T, 
TPR, and TPL) and twelve blocks in test parts requiring the handling 
of two blocks at a time (PS, PA, TPS, and TPA). The practice trial 
followed the demonstration, the applicant continuing the procedure at 
the point where the demonstration ended. The practice trial was used 
to further clarify instructions where necessary. 

The group tested was composed of men hired as converting machine 
operators in a paper mill. The work requires occasional machine ad- 
justments, but consists chiefly of removing a specified number of facial 
tissues from the machine, raising the top tissue in order to insert adver- 
tising material, and placing the package of tissues in a conveyor. 

All subjects were high school graduates. Ages ranged from 18 to 31, 
with a mean of 22.1 and a sigma of 3.6 years. All subjects professed to 
be right-handed. The test was administered¹ after it had been decided 
to hire the applicant, but previous to his notification of this fact. Test 
results were sent directly to the author in the corporation’s Home Office. 
No mill supervisor was acquainted with test results of any individual. 
Results lay dormant for several months until sufficient data had been 
accumulated to warrant validation of the test. 

Ratings were used for criteria of job success. The name of each ratee 
was typed on an index card, and raters were asked to sort the cards into 
three piles on the basis of speed of work, namely: fast worker, average 
speed worker, and slow worker. The middle group (average speed 
worker) was then redivided into average plus, average, and average minus. 
Numerical values from one to five were assigned the five groups. 
Independent ratings were secured from three supervisors. Ratings were 
converted to T scores and the sum of the three T score ratings used as the 
criterion of job success. 

1 The author is indebted to Mr. Harry D. Gates, Manager of the Neenah-Menasha 
(Wis.) office of the United States Employment Service, for that agency’s administration 
of the tests upon which these data are based. 

Correlations between the ratings of each supervisor with those of the 
others are: AB, .426; AC, .555; BC, .518. The reliability of the sum of 
the ratings as estimated by the Spearman-Brown prophecy formula is 
.749. 
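
The reliability figure can be approximated with the general Spearman-Brown formula. The sketch below assumes that the three inter-rater correlations are simply averaged before being stepped up to three raters; the paper does not say how the averaging was done, and the function name `spearman_brown` is illustrative only. Under that assumption the result comes out at about .75, in line with the reported .749.

```python
def spearman_brown(mean_r: float, k: int) -> float:
    """Estimated reliability of the sum of k parallel measures."""
    return k * mean_r / (1 + (k - 1) * mean_r)

# Inter-rater correlations reported in the text.
pairwise = {"AB": 0.426, "AC": 0.555, "BC": 0.518}

# Assumption: a plain arithmetic mean of the three coefficients.
mean_r = sum(pairwise.values()) / len(pairwise)

rating_reliability = spearman_brown(mean_r, k=3)
print(f"mean r = {mean_r:.3f}, reliability of summed ratings = {rating_reliability:.3f}")
```

The same function with `k=2` gives the step-up used for the sum of two timed trials.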

Validity coefficients based on a group of 60 employees are: 


PR   .325      PA   .255      TPL  .350 
PL   .029      T    .455      TPS  .333 
PS   .108      TPR  .572      TPA  .360 


The Wherry-Doolittle Test Selection Method was used to select and 
weight the test procedures which would give the optimum diagnostic 
battery in the present situation. This method selects tests in the order 
of their contribution to the multiple coefficient, and gives the maximum 
correlation after a correction has been made for the chance error which 
accompanies the addition of each test to the battery (4). 

A maximum multiple correlation of .660 was obtained from three of 
the test parts. In the order of their contribution to the multiple they are: 
TPR, T, and PL. This compares with a multiple R of .461 resulting 
from the administration of the test in the usual ways; namely, PR and T. 

Test scores were available for 212 employees, 60 of whom were in the 
criterion group. The additional 152 employees were comparable in all 
ascertainable ways with the criterion group, but because of a change in 
promotion policies were never placed on the job for which the test was 
being validated. As has frequently been pointed out (3) the size of a 
multiple correlation depends on the correlation of each test with the 
criterion and the correlation of each test with each of the others. In the 
situation reported here, it appeared advisable to use a regression equation 
based on validity coefficients obtained from an N of 60 and test 
intercorrelations obtained from an N of 212.² 

Using validity coefficients based on an N of 60 and intercorrelations on 
an N of 212, a maximum R of .718 was obtained from the following five 
test parts: TPR, T, PL, PR, PS. The R obtained from the usual tests 
of PR and T was .466. 

2 This procedure assumes that the larger group is comparable in all relevant respects 
with the smaller (criterion) group. The multiple correlation may be larger or smaller 
than that obtained from data limited to the criterion group, but the decrease in probable 
errors of test intercorrelations results in increased predictive accuracy when the 
regression equation is applied to subsequent groups. 

Regression equations based on maximum multiple correlations included 
negative weights. Such equations cannot be used in the usual 
employment situation. Personnel men and mill supervisors see no logic 
in giving credit for slow speeds, and consequently refuse to use test results 
or use them only under pressure. Further, applicants frequently learn 
that scores can be improved by increasing the time required for certain 
test parts, with the result that such test parts lose their validity and the 
regression equation no longer can be used for predictive purposes. 

With reference to the expanded group of 212, and excluding test parts 
which contributed negatively to the R, a maximum coefficient of .611 
was obtained from the test parts TPR and T. Administration of the 
usual test parts (PR and T) resulted in an R of .466. Interpreted from 
the viewpoint of index of forecasting efficiency (2) the standardized pro- 
cedures have an efficiency of 10.8 per cent better than chance, and the 
two optimum procedures have an efficiency of 20.2 per cent. 
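
The two-test multiple correlations quoted here can be checked with the standard two-predictor formula. The sketch below takes the validity coefficients as .325 (PR), .455 (T), and .572 (TPR), and the intercorrelations from Table 1 as .524 (PR with T) and .464 (TPR with T). It uses the plain, uncorrected formula, not the Wherry-Doolittle procedure with its correction for chance error, and the function name is illustrative.

```python
import math

def multiple_r_two_predictors(r1: float, r2: float, r12: float) -> float:
    """Multiple correlation of a criterion with two predictors.

    r1, r2 : validity coefficients of the two predictors
    r12    : correlation between the two predictors
    """
    r_squared = (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)
    return math.sqrt(r_squared)

# Validities (N = 60) and intercorrelations (N = 212) as reported.
VALIDITY = {"PR": 0.325, "T": 0.455, "TPR": 0.572}
R_PR_T = 0.524
R_TPR_T = 0.464

r_usual = multiple_r_two_predictors(VALIDITY["PR"], VALIDITY["T"], R_PR_T)
r_optimum = multiple_r_two_predictors(VALIDITY["TPR"], VALIDITY["T"], R_TPR_T)
print(f"PR and T:  R = {r_usual:.3f}")
print(f"TPR and T: R = {r_optimum:.3f}")
```

Run on these inputs, the formula gives values that agree with the reported .466 and .611.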

A similar situation is found when the regression equation is limited to 
members of the criterion group. The R based on the optimum test parts 
(TPR and T) is .600, whereas that based on customary test parts (PR 
and T) is .461. 

Ziegler has reported a correlation of .57 between the Placing and 
Turning tests in a group of 500 cases (5). Intercorrelations (based on 
212 males) between the nine ways in which the test was administered in 
the situation reported here are given in Table 1. This table includes 
numerous correlations smaller than that between the original placing and 
turning tests, which in the group referred to here was .524. 


Table 1 


Intercorrelations of Nine Test Procedures 
N = 212 Men Applicants for Mill Jobs 





        PR     PL     PS     PA     T      TPR    TPL    TPS    TPA 

PR      --     .642   .653   .523   .524   .530   .494   .460   .407 
PL      .642   --     .685   .579   .550   .369   .585   .531   .503 
PS      .653   .685   --     .595   .541   .416   .543   .540   .473 
PA      .523   .579   .595   --     .501   .422   .545   .550   .680 
T       .524   .550   .541   .501   --     .464   .520   .551   .549 
TPR     .530   .369   .416   .422   .464   --     .598   .636   .545 
TPL     .494   .585   .543   .545   .520   .598   --     .733   .590 
TPS     .460   .531   .540   .550   .551   .636   .733   --     .703 
TPA     .407   .503   .473   .680   .549   .545   .590   .703   -- 





The mean and sigma of the nine test procedures used in this investiga- 
tion are given in Table 2. Data are based on seconds required for two 
trials following one untimed (practice) trial. 

Table 2 

Means and Sigmas of Nine Procedures in Administering the Minnesota 
Rate of Manipulation Test 
(N = 212) 

Procedure     Mean     Sigma 
PR            123.2    10.1 
PL            126.5    10.1 
PS             71.5     6.9 
PA             84.3     9.5 
T              98.9     9.3 
TPR           152.6    15.7 
TPL           158.6    15.8 
TPS            86.9     9.4 
TPA           102.0    12.5 





The reliability of each procedure was determined by correlating the 
two timed trials and applying the Spearman-Brown prophecy formula to 
estimate the reliability of the sum of the two trials. Reliabilities (given 
in detail in Table 3) range from .874 to .955. The two lowest reliabilities 
(.874 and .908) are those for the two procedures used by Ziegler. 








Table 3 
Reliability 
(N = 212) 

Procedure     Sum of 2 Trials 
PR            .874 
PL            .936 
PS            .929 
PA            .940 
T             .908 
TPR           .953 
TPL           .934 
TPS           .940 
TPA           .955 





The additional test administration procedures used in this investiga- 
tion were therefore found to be superior to Ziegler’s procedures in three 
ways: (1) validity coefficients were higher, (2) intercorrelations of test 
parts were lower, and (3) reliability was increased. Consequently, it is 
recommended that employers using the Minnesota Rate of Manipulation 
Test for employee selection investigate the advisability of administering 
the test in ways which are additional to, or in place of, the customary 
procedures. 


The Effects of A Second Administration of 
An Employment Test 


Leonard W. Ferguson 
Field Training Division, Metropolitan Life Insurance Company 


It is a well known fact that repetition of a mental test very often 
results in an increase in test scores. Adkins (1), for example, has shown 
that repetition of the Otis, Kuhlmann-Anderson or Morgan tests at intervals 
of one year causes the second score to be higher than that of comparable 
individuals taking one of these tests for the first time. Terman 
and Merrill (3) report an increase in mean IQ when one of the alternate 
forms (L or M) of the Revised Stanford-Binet is given within a few days 
of the other, while Thorndike (4) has found similar results on a group 
intelligence test for an interval as short as one day. It is important, 
therefore, for the employment manager or personnel director to ascertain 
possible effects and implications of a second administration of an employ- 
ment test—particularly, when a situation exists in which the second score 
is likely to be one of the important hiring criteria. 

Such a situation may sound unusual, but it can and does exist. There 
are five life insurance companies in Hartford, Connecticut, four of which 
administer as part of their regular employment procedure one of the two 
mental alertness tests prepared by the Committee on Tests of the Life 
Office Management Association. It happens quite frequently, therefore, 
that an individual seeking employment will make application at several 
companies and thereby gain the opportunity of taking the same test a 
second, and sometimes even a third and fourth time. There are other 
individuals who may apply for work only at one company but upon a 
number of different occasions, and so have the opportunity of taking a 
L.O.M.A. test more than once. The unique situation caused by these 
practices provides an excellent laboratory in which to determine the 
possible consequences of a second administration of an employment test. 

At the time this study was undertaken (June, 1942) data were avail- 
able for 243 individuals who had taken a L.O.M.A. employment test 
more than once. Of these, 214 were tested at least once by company A, 
117 by company C and 52 by company P. Sixty-four applicants were 
tested twice by company A, but all others were tested only once by each 
company so that the first and second tests were administered by different 
examiners. One hundred twenty seven individuals took L.O.M.A. test 
1A twice, 53 took L.O.M.A. 1A first and L.O.M.A. 1B second, while 50 
took form 1B first and 1A second. Thirteen candidates took both tests 
the same day so that the order in which they were taken could not be 
determined, and no records were available for those who may have taken 
form 1B a second time. 

Table 1 
The Distribution of Score Changes Produced by Two Test Administrations 

            1A 1st;    1A 1st;    1B 1st; 
Change      1A 2nd     1B 2nd     1A 2nd     Total 

[Frequencies not recoverable; score changes are classed in five-point intervals from 60 to 64 down to -30 to -26.] 

The first question of importance concerns the change in scores resulting 
from the second test administration. Table 1 shows the distribution 
of these changes for the 230 applicants for whom appropriate records 
were available. The mean increase for the total number of cases is 12 
points, with a range of ninety (+62 to -28) and a standard deviation 
of 15. Two administrations of the same test (1A and 1A), however, 
result in a higher increase (16 versus 7 or 10 points) than the administra- 
tion of an alternate test form. The difference between 16 and 8, the latter 
figure representing the average for the 1A-1B and 1B-1A combinations 
together, is statistically significant (critical ratio = 4). The odds are 
only three out of 10,000 that such a difference could have occurred by 
chance. 

Is there any relationship between the amount of the score change and 
the time elapsing between the two test administrations? In some in- 
stances the tests were administered the same day, but in others the inter- 
val was as much as five years. The mean interval, however, was slightly 
over eleven months. The correlation between the score change and the 
time interval is .03, or practically nil, so that the increase in score appears 
to be due to the repetition of the test rather than to any learning, practice 
or experience gained in the interval. 

Table 2 
Per Cents of Applicant Group that Exceed Various Critical Score Levels 

Critical or      1A 1st    1A 2nd     1A 1st    1B 2nd     1B 1st    1A 2nd     First     Second 
Passing Score     N    %    N    %     N    %    N    %     N    %    N    %     N    %    N    % 

120              13   10   52   40     9   18   13   27    11   21   19   37    33   14   84   37 
110              46   36   81   63    15   31   23   47    19   37   26   50    80   35  130   57 
100              66   51  106   82    21   43   30   61    28   54   41   79   115   50  177   77 
90               93   72  118   92    31   63   34   69    37   71   47   90   161   70  199   87 
80              108   84  125   97    40   82   42   86    46   88   48   92   194   84  215   93 

Total N         129       129         49        49         52        52        230       230 

How seriously is the picture of a group of applicants distorted by the 
second series of scores? Table 2 provides an answer. It shows the per 
cents of employees that, on the basis of the first and second scores, may 
be considered acceptable candidates for employment. The data show 
that the picture given by the second score is seriously distorted. For 
example, if a score of 100 is accepted as a passing mark the second series 
of scores indicates that 77 per cent of the group could be considered 
suitable for employment. This represents a serious and statistically 
significant error, however, for the first series of scores shows that only 
50 per cent of the candidates should be considered acceptable. Other 
critical score levels in the table yield similar results. 
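
The "First" and "Second" columns of Table 2 can be rebuilt from the three sub-group counts. The sketch below uses the counts as read from the table; the dictionary layout and the function name `total_pass_rates` are illustrative, not part of the original study.

```python
# Number of applicants passing each critical score, from Table 2.
# Each entry is (first-test count, second-test count).
GROUP_N = {"1A-1A": 129, "1A-1B": 49, "1B-1A": 52}
COUNTS = {
    120: {"1A-1A": (13, 52), "1A-1B": (9, 13), "1B-1A": (11, 19)},
    110: {"1A-1A": (46, 81), "1A-1B": (15, 23), "1B-1A": (19, 26)},
    100: {"1A-1A": (66, 106), "1A-1B": (21, 30), "1B-1A": (28, 41)},
    90: {"1A-1A": (93, 118), "1A-1B": (31, 34), "1B-1A": (37, 47)},
    80: {"1A-1A": (108, 125), "1A-1B": (40, 42), "1B-1A": (46, 48)},
}

def total_pass_rates(critical_score: int) -> tuple[int, int]:
    """Per cent of all applicants passing on the first and on the second test."""
    total_n = sum(GROUP_N.values())
    first = sum(pair[0] for pair in COUNTS[critical_score].values())
    second = sum(pair[1] for pair in COUNTS[critical_score].values())
    return round(100 * first / total_n), round(100 * second / total_n)

for score in COUNTS:
    first_pct, second_pct = total_pass_rates(score)
    print(f"critical score {score}: first {first_pct}%, second {second_pct}%")
```

At a critical score of 100 this gives 50 per cent on the first test against 77 per cent on the second, the comparison discussed in the text.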

The existence of more than one test record for the subjects of this 
study indicates that all were at one time refused or did not expect employ- 
ment and that all re-applied at some future date. Upon this second 
application 77 of these individuals were hired: 66 by company A, 1 by 
company C, 8 by company P and 2 by company M (a non-Hartford 
concern). How do these individuals differ from those who were still 
refused? Table 3 supplies the pertinent data. Both the employed and 
the not-employed groups show a gain in test score upon the second test 
administration, but the employed group appear to gain slightly more than 
the not-employed group (i.e., 15 versus 12 points). In this connection it 
is noteworthy that the group finally employed average higher than the 
group not employed even on the first test. Therefore, although the second 
score was the criterion of employment, the distortion it caused did not 
result in the selection of a less qualified group than the persons who were 
refused employment. 

Table 3 
Mean L.O.M.A. Scores for the Employed and the Not-Employed Groups 

                          Mean L.O.M.A. Score 
                       Original   Second 
                       Score      Score     Increment 

Employed group: 
                         108       125        17 
                         126       124        -2 
                         116       128        12 
  All employed           111       126        15 

Not-employed group: 
                          97       111        14 
                          97       106         9 
                          97       107        10 
  All not employed        97       109        12 

Ordinarily, applicants who secure scores less than 100 are not hired.¹ 
That this is true can be partially verified by the fact that only two of the 
77 individuals who were ultimately hired secured scores lower than this 
on the second test. Since the second test was used as the criterion, 
however, it is possible that a number of applicants who secured scores 
under 100 on the first test were hired. Actually there were 27 such cases. 
It is highly probable that had the first score been known at the time of 
selection, all these individuals would have been denied employment. It 
should be interesting to ascertain the degree of success achieved by these 
employees. 

Data by which to judge the success of this group could be secured for 
only 19 individuals employed by company A. For comparison with 
these, the records of 32 persons having secured scores over 100 on the 
first test and employed by this same company were available. The 
criterion by which success was determined consisted of the combined job 
classification level and merit rating scheme used by company A. The 
employees were divided into a less successful group and into a more 
successful group each of which may be characterized as follows: The less 
successful group consists of those engaged in “simple operations that 
require the use of a few definite rules, including jobs in which only a 
regular and definite change is made in the material handled and in no 
case including jobs in which a large variety of rules must be understood 
and applied.” The more successful group of employees consists of those 
engaged in “operations that require the use of a large number of rules . . .” 
or who are engaged in jobs of a more complicated nature. 

1 This appears to be a common practice, although not a rigid rule, for most 
companies that make use of L.O.M.A. tests 1A or 1B in their employment program. 

Only three of the 19 individuals whose first test score was under 100 
could be classified into the more successful group. Sixteen (84 per cent) 
of them, therefore, are relatively unsuccessful. This result should be 
compared with that for the 32 employees who originally secured scores 
over 100, fifty per cent of whom were in the more successful group while 
the remaining fifty per cent were in the less successful group. The 
difference cannot be attributed to experience because the mean lengths of 
service for the successful and unsuccessful groups are sufficiently 
similar to rule out that possibility. With regard to employees securing 
under 100 on the first test, therefore, the indications of probable lack of 
success are almost completely verified. 

What do the second test scores appear to indicate? Since all 51 
employees secured scores over 100 on the second test, at least fifty per 
cent of them should have been in the more successful group, but only 37 
per cent of them are so classified. If the second score has the same sig- 
nificance as the first, those who secure over 100 on the second test should 
behave in the same way as those who secure over 100 on the first. This is 
manifestly not the case. It is quite evident that the second score does 
not possess the same significance as the first and that individuals hired 
upon the basis of the second score cannot be expected, as a group, to be 
as satisfactory as those hired upon the basis of the first test score. 
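
The percentages in the two preceding paragraphs follow directly from the counts given in the text. The short sketch below reproduces them; the variable and function names are illustrative, and the 16 and 16 split of the 32 higher-scoring employees is derived from the stated fifty per cent.

```python
# Company A employees, grouped by first-test score (counts from the text).
below_100 = {"more_successful": 3, "less_successful": 16}    # 19 employees
above_100 = {"more_successful": 16, "less_successful": 16}   # 32 employees

def pct_more_successful(group: dict) -> float:
    """Per cent of a group classified as more successful."""
    return 100 * group["more_successful"] / sum(group.values())

# All 51 employees passed 100 on the second test, so the second-score
# picture is simply the two first-score groups pooled.
combined = {key: below_100[key] + above_100[key] for key in below_100}

print(f"first score under 100: {pct_more_successful(below_100):.0f} per cent more successful")
print(f"first score over 100:  {pct_more_successful(above_100):.0f} per cent more successful")
print(f"all 51 (second score): {pct_more_successful(combined):.0f} per cent more successful")
```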

When only the second score is known, how much must the critical 
score level be raised in order to secure a group of employees that will 
achieve the same degree of success as those who pass the critical level of 
100 on the first test? Increasing it by the average gain made by all 
applicants (i.e., 12 points) is not sufficient. Upon the basis of the records 
of the 51 employees of company A only ten of the nineteen “undesirable” 
individuals would be eliminated. An increase to 125, however, will 
result in the elimination of all but four of those who originally secured 
scores under 100. The significant thing about this increase is that 25 
points represents approximately one standard deviation of the distribution 
of original test scores. Therefore, it is necessary to raise the passing 
mark from a level that 50 per cent of all applicants taking the test for 
the first time can ordinarily pass to one which only 15 or 20 per cent can 
usually exceed. 
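
The "15 or 20 per cent" figure is what a normal curve would give for a cutoff one standard deviation above the mean. The sketch below assumes first-test scores are roughly normal with a mean near 100 and a standard deviation near 25. The text implies both values but does not state them as such, and the normality assumption is added here for illustration.

```python
import math

def fraction_above(z: float) -> float:
    """Upper-tail area of the standard normal curve beyond z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Assumed, approximate values implied by the text.
mean_first, sd_first = 100, 25

for cutoff in (100, 112, 125):
    z = (cutoff - mean_first) / sd_first
    print(f"cutoff {cutoff}: {100 * fraction_above(z):.1f} per cent expected to pass")
```

Under these assumptions a cutoff of 125 would be passed by roughly 16 per cent of first-time applicants.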

Summary 


The records of 243 applicants who had taken a L.O.M.A. employment 
test more than once were examined in order to determine the average 
increase in test score and to discover the implications that this increment 
might have for the interpretation of the evidence that test scores provide. 
The mean increase for all applicants was found to be 12 points; those who 
were ultimately hired increased their scores slightly more than those who 
were not ultimately hired; and most of those individuals whom the first 
test score indicated should not have been hired turned out to be relatively 
unsuccessful employees. 

In order to overcome the distortion produced by the second score 
when the first score cannot be secured it is necessary to raise the critical 
or passing level 25 points, an amount which represents approximately one 
standard deviation of the distribution of original test scores. 


Some Comments on “The Prediction of Differential 
Achievement in A Technological College” 


Arthur E. Traxler 
Educational Records Bureau 


In the last number of the Journal of Applied Psychology, Professor 
William McGehee reported a study of the value of three tests for the 
prediction of marks in certain technical curricula. The tests were the 
American Council Psychological Examination, 1939 Edition; the Co- 
operative English Test, Form OM; and the Cooperative Mathematics 
Test, Form P. The curricula were agriculture, engineering, textile, and 
vocational education. The subjects were students enrolled in the fresh- 
man year of the North Carolina State College of Agriculture and En- 
gineering. The correlations were rather low, although not at variance 
with the findings of other investigators. The zero-order correlations 
ranged from .27, for scores on the mathematics test and marks in agri- 
culture, to .58 for the English test and marks in vocational education. 
The multiple correlations for all three tests with marks varied from .41, 
between the tests and marks in agriculture, to .65, between the tests and 
marks in vocational education. 

Professor McGehee carried the statistical treatment somewhat further 
than the zero order and multiple correlations. He presented data to show 
that through the use of the tests, the maximum error reduction in predic- 
tion was 24 per cent and that the least error reduction was less than 4 per 
cent. According to his data, the combination of the tests reduced the 
error of prediction only 15.2 per cent. Professor McGehee stated further 
that 71.9 per cent of the factors in all curricula were not accounted for by 
the tests used in the investigation. 

The concluding paragraph in the article was as follows: 

“It is entirely possible that the tests themselves, constructed on differ- 
ent populations, fail to measure certain aptitudes and knowledges re- 
quired for success in the particular situation in which they are being used. 
It is well known that tests which are successful in selecting workers in a 
given plant cannot be adopted unmodified for selecting workers in another 
plant, even though that plant may be manufacturing similar articles. 
The writer, therefore, suggests that a job analysis of the requirements for 
academic success in a given institution would reveal aptitudes and knowl- 
edges needed there. It is entirely possible that tests constructed to 
measure the specific characteristics thus uncovered would give coefficients 
of non-determination of much less magnitude than 71.9.” 

Professor McGehee has rendered a service through his emphasis on 
and illustration of the point that prediction based on a certain battery 
of tests is not equally good in all types of curricula. He has interpreted 
his data with commendable caution. Nevertheless, before counselors in 
technological institutions decide to abandon tests of the standardized type 
in favor of tests constructed to meet the requirements for academic success 
in a given institution as revealed by a job analysis along the line suggested 
by Professor McGehee, it would be advisable for them to consider certain 
limitations to the data in a study of this kind. 

As already indicated, the criterion of success employed in the study 
was a point average ratio based on marks assigned by instructors. This 
of course represents a commonly used criterion, since a better index of 
scholastic success usually is not available. Nevertheless, in the interpre- 
tation of the correlations of test scores with marks, the unreliability of 
the criterion should be kept constantly in mind. Professor McGehee does 
recognize that marks are a fallible criterion of success but he probably does 
not give full weight to this point. The reliability of teachers’ marks 
seldom exceeds .65, yet one of the multiple correlations, that for the test 
scores and point average ratio in vocational education, reaches .65. It 
thus appears that this particular correlation is fully as high as one could 
expect in the absence of evidence that the marks employed in this study 
were unusually reliable. The other correlations are, however, consider- 
ably lower. 

The battery of tests employed in the study obviously provides only a 
limited sampling of aptitude. The American Council Psychological 
Examination is what is known as a two-axis test. It consists of six parts, 
three of which are combined to yield a subtotal score for linguistic ability 
and the other three of which are brought together to form a quantitative 
subtotal. The test also yields a total score, which is the sum of the 
linguistic and quantitative sections. The Cooperative English Test of 
course depends mainly upon linguistic, or verbal, ability, and it may be 
assumed that the Cooperative Mathematics Test is heavily saturated with 
a quantitative factor. Thus, it appears that the battery of three tests 
measures aptitude and achievement in just two areas, the verbal and 
the numerical. 

It seems probable that very few counselors would expect that these 
tests alone would provide very accurate prediction of success in techno- 
logical curricula. It is obvious from a casual examination of the general 
areas they measure that they do not cover by any means all the abilities 
and types of achievement that enter into success in such a subject as 
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agriculture or engineering. Each of these tests does contribute something 
to prediction in the technological field, as the study clearly shows. It 
seems to the writer that what is needed is not so much replacement of 
these tests by an entirely different measurement approach but rather the 
avoidance of the duplication in the abilities tested and the supplementing 
of the battery by other types of tests. 

In view of the rather high correlation that has been found in other 
studies between the linguistic score on the American Council test and the 
Cooperative English score, and between the quantitative score on the 
American Council test and the Cooperative Mathematics Test, one could 
justifiably exclude either the American Council Psychological Examina- 
tion or the combination of English and mathematics tests from the pre- 
dictive battery. One could then add a test in science, such as the Co- 
operative General Achievement Test in Science, a test of mechanical 
aptitude, and a test of background information in the particular cur- 
riculum in which success is being predicted. 

It is believed that a judicious selection of the predictive battery from 
the tests now available would considerably raise some of the multiple 
correlations reported in the study even with a criterion as unreliable as 
teachers’ marks. Data are not at hand to support this hypothesis, but 
before the guidance people in individual technological institutions embark 
upon the highly specialized job of test construction, it would seem desir- 
able for them to experiment with a somewhat more varied and appropriate 
test battery than the one utilized in Professor McGehee’s helpful study. 

Near the end of the article, the following statement appears: “It has 
been asserted by counselors and others that, even though psychological 
tests fall down in mass prediction, they are of value in predicting the 
performance of individuals who stand either high or low on the tests.” 
To test this hypothesis, Professor McGehee first divided a group of 210 
students into three groups on the basis of scores on the American Council 
Psychological Examination. These groups consisted of the students in 
the upper three deciles, the middle four deciles, and the lower three deciles. 
Correlations were then computed between the test scores and point aver- 
ages of the students in each group. A comparison of the correlations 
showed that they were very low and of similar magnitude. The correla- 
tions were as follows: High Group, .25; Middle Group, .15; Lower Group, 
.15. It was concluded that the hypothesis was not correct because higher 
correlations were not obtained for the upper and lower three deciles than 
for the middle four deciles, and because all three correlations were so small 
as to show little more than a chance relationship. 

The statistical procedure employed in this connection does not seem 
altogether appropriate for an appraisal of the hypothesis under 
consideration. Because of the restriction in the variability within each of the 
three groups, one would expect the correlations to be very low. The 
magnitude of the correlations for the higher three deciles and the lower 
three deciles as compared with the middle four deciles neither proves nor 
disproves the hypothesis. Even if these correlations were zero, the tests 
might still be of considerable value for prediction if the students who fall 
within the higher group on the tests are in the higher group on the basis 
of point averages and if the students who are in the lower group on the 
tests are also in the lower group on the basis of marks. While it is true 
of course that the tests would be of greater value for precise prediction if 
there were high correlations of scores and marks within these groups, 
such precision is by no means an absolute requirement in the practical 
work of counseling. If one can say with confidence that students who 
are relatively high in test scores will also be relatively high in point aver- 
ages, and that those who are low in test scores will also be low in point 
averages, he has a useful basis for counseling even though his measuring 
instruments may not be reliable enough to enable him to predict more 
precisely within each of these groups. This is not to say that the Ameri- 
can Council test has these values for prediction of success in technological 
courses, but the statistical procedures employed in the article do not 
disprove the hypothesis that such values may attach to that test. This 
comment, however, is on a minor aspect of the article. 
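
The effect of restricted variability can be illustrated with a small simulation. The sketch below does not use Professor McGehee's data. It assumes a hypothetical population in which test scores and point averages are bivariate normal with an overall correlation of .50; the function names `pearson` and `simulate`, the sample size, and the seed are illustrative choices only.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simulate(n=20000, rho=0.5, seed=0):
    """Simulate (test score, point average) pairs and split them by test score
    into the lower three, middle four, and upper three deciles."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)                                   # test score
        y = rho * x + sqrt(1 - rho ** 2) * rng.gauss(0, 1)    # point average
        pairs.append((x, y))
    pairs.sort()                                              # order by test score
    lo_cut, hi_cut = int(0.3 * n), int(0.7 * n)
    groups = {
        "low": pairs[:lo_cut],
        "middle": pairs[lo_cut:hi_cut],
        "high": pairs[hi_cut:],
    }
    result = {"overall": pearson(*zip(*pairs))}
    for name, group in groups.items():
        xs, ys = zip(*group)
        result[name] = pearson(xs, ys)                  # within-group correlation
        result[name + "_mean_mark"] = sum(ys) / len(ys)  # group mean point average
    return result


if __name__ == "__main__":
    for key, value in simulate().items():
        print(f"{key:>18}: {value:6.3f}")
```

Under these assumptions the within-group coefficients fall well below the overall coefficient, while the mean point averages of the three groups remain clearly ordered. That is the distinction drawn above between precise prediction within a group and prediction of group membership.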


The study, as a whole, would have been somewhat more meaningful 
if standard deviations had been reported for the scores of the various 
groups on the different tests. 


Received February 15, 1943. 











Likes, Dislikes, and Vocational Interests* 
Ralph F. Berdie 


University of Minnesota 


Vocational interests, as defined by the Strong Vocational Interest 
Blank, are, for the most part, expressions of liking and disliking. Atten- 
tion has hitherto been focused upon the objects of this liking and disliking. 
The interdependency between these objects cannot be denied. Thorn- 
dike, (9) after an extensive study on the interests of adults, concluded, 


“All these records go to show two general facts. First, there is great 
specialization of interests. Second, such group factors as appear seem 
more related to characteristics of the situation responded to than to 
unitary ‘Traits’ in the persons. Music, sport, friendly intercourse and 
talk, fiction and drama are certainly more obvious, and are probably more 
significant, as organizing causes than conscientiousness, pugnacity, 
exercise and the like.” 


Lorge (3) has suggested, nevertheless, that individuals have habitual 
ways of responding in a situation, regardless of what the objects being 
responded to are. He found a positive correlation between the number 
of “yes’s” marked in the Bernreuter test, the number of checks on the 
Thurstone Inventory, the number of “L’s” marked on the Strong test 
and the number of “1’s” and “2’s” on the Thorndike test. Similar rela- 
tionships were found between the negative reactions on the tests and 
between the doubtful reactions on the tests. 

Terman and Miles (8) found a relationship between sex and the tend- 
ency to mark extreme answers on their Interest-Attitude test. Women 
were more prone to check answers at either extreme while men tended 
to mark answers in the middle. 

Rulon, (5) using the unrevised Strong Interest Blank investigated the 
effect upon the interest profile of systematically answering the test by 
marking all the “likes” and responses in the left hand column and all the 
“dislikes” and responses in the right hand column. He found that mark- 
ing all the likes gave “A” on the keys for Y.M.C.A. secretary, minister, 
vacuum cleaner salesman and vacuum cleaner sales manager and scores 
of “B” on keys for production manager and life insurance salesman. 
Marking all the “dislikes” gave scores of “A” on keys for artist, physician, 
advertising man, lawyer and journalist and scores of “B” on the keys for 
chemist, production manager, Y.M.C.A. secretary, teacher and vacuum 
cleaner salesman. Thus the tendency to like or dislike results in interest 
patterns characteristic of men in certain vocational fields. 


* This study is one of a series of studies in process on clinical problems of interest 
measurement at the University of Minnesota Testing Bureau (1) (6). 

Tendencies to like or dislike appear fairly stable over periods of three 
weeks, one year, two years and three years, according to a study by Rock 
(4) on 700 college and high school students who took the Strong Interest 
Blank and later took it again. After a period of three weeks, the average 
item per cent of agreement (percentage of identical responses on both 
occasions) was 66 per cent. After a period of three years, this mean 
consistency was 58 per cent. The older students, the better students and 
the more intelligent students tended to be slightly more constant. No 
relationship existed between the constancy of the item and its discrimina- 
tive value on the occupational keys. 
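
The agreement measure described above (the percentage of identical responses on the two occasions) can be sketched as follows. The response lists are invented for a single hypothetical respondent and are not taken from Rock's data; the function name `percent_agreement` is likewise only illustrative.

```python
def percent_agreement(first, second):
    """Percentage of items answered identically on two administrations."""
    if len(first) != len(second):
        raise ValueError("both administrations must cover the same items")
    same = sum(a == b for a, b in zip(first, second))
    return 100.0 * same / len(first)


# Hypothetical responses to ten items: L = like, I = indifferent, D = dislike.
first_testing = ["L", "I", "D", "L", "D", "I", "L", "L", "D", "I"]
second_testing = ["L", "I", "L", "L", "D", "D", "L", "I", "D", "I"]

print(percent_agreement(first_testing, second_testing))  # 7 of 10 identical -> 70.0
```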


Methods and subjects 


In the present study, the relationship between the number of likes and 
dislikes marked on the Strong Vocational Interest Blank by 411 men and 
other variables was observed. These variables included high school rank, 
college aptitude, college grades and scores on tests purporting to measure 
social adjustment, emotionality and morale. 

In the fall of 1939 freshmen entering the College of Science, Literature 
and the Arts of the University of Minnesota were given an extensive 
battery of tests. These tests included the Strong Vocational Interest 
Blank, the Minnesota Personality Scale and other tests. High school 
percentile ranks, scores on the American Council on Education Psycho- 
logical Examination, form 1937, and first year college honor point ratios 
were also available and complete data were obtained for 411 men. 

The number of likes and dislikes marked on each side of the answer 
sheet of the Strong blank was counted by inserting the answer sheets into 
the International Business Machines Corporation test scoring machine. 
By using the appropriate scoring stencil, the number can be read directly 
from the dial of the machine. The reliability of this score was determined 
by finding the correlation for the number of dislikes marked on the 
first side of the answer sheet and the number marked on the second 
side and the same correlation for the numbers of “likes” marked. The 
correlation for the number of likes is .66, corrected by the Spearman- 
Brown formula to .79. As this is not truly a split-half correlation, for an 
unequal number of items appears on each side of the answer sheet, this 
correction is only a rough approximation of the estimated reliability of 
this score. The correlation for the number of dislikes is .65 corrected to 
.79. The scores thus obtained have sufficient stability to warrant their use 
in statistical comparisons and the correlations indicate that on the Strong 
test, some individuals show tendencies to like items while others show 
tendencies to dislike them. This can be considered one aspect of 
personality. 
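
The Spearman-Brown correction used above estimates the reliability of the full-length score from the correlation r between two halves as 2r / (1 + r). A minimal sketch follows; the function name `spearman_brown` and the lengthening factor `k` are illustrative, and the formula assumes halves of equal length, which, as noted above, holds only roughly here.

```python
def spearman_brown(r, k=2.0):
    """Estimated reliability of a test lengthened k times, given the
    correlation r between comparable parts (k = 2 for two halves)."""
    return k * r / (1.0 + (k - 1.0) * r)


print(round(spearman_brown(0.65), 2))  # dislikes: .65 corrected -> 0.79
print(round(spearman_brown(0.66), 3))  # likes: .66 corrected -> 0.795
```

Because the reported coefficients are themselves rounded to two places, values recomputed from them may differ from the published figures in the second decimal.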

To determine the effects of systematically marking likes, dislikes and 
indifferents on the revised Strong blank for men and on the blank for 
women, scores were obtained on both occupational and non-occupational 
keys for blanks which had all the likes, dislikes, or indifferents marked. 


Results 


To determine to which factors liking and disliking were related, the 
number of likes and the number of dislikes checked by individuals were 
correlated with high school percentile rank, first year college honor point 
ratio, score on the American Council Examination and scores on the 
morale key, the social adjustment key and the emotionality key of the 
Minnesota Personality Scale. These correlations are presented in Table 
1. High scores on the personality scale are indicative of better morale, 
greater sociability, and greater emotional stability. 


Table 1 


Correlations Between Number of Likes and Number of Dislikes Checked on the 
Strong Interest Blank and other Variables 





                            Correlation      Correlation 
Other Variables             with Likes       with Dislikes 

High school percentile      [illegible]      -.15† 
American Council Exam       [illegible]      -.10* 
[label illegible]           [illegible]      -.18† 
[label illegible]           [illegible]      -.15† 
Social adjustment           [illegible]      -.16† 
Emotionality                [illegible]      -.06 


* Significant at 5 per cent level of probability. 
† Significant at 1 per cent level of probability. 


Of the twelve correlation coefficients, seven would deviate from zero 
to this extent by chance alone less than one time in one hundred and two 
others would occur by chance less than five times in one hundred. The 
extent to which a person likes or dislikes items on the Strong blank is 
related to other personality, achievement, and ability factors. This rela- 
tionship is small, although stable. It does have many interesting 
implications. 
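
The significance levels attached to coefficients of this size can be checked approximately for a sample of 411. The sketch below is an assumption about method, not the author's stated procedure: it converts r to a t ratio with n - 2 degrees of freedom and, since n is large, compares it with two-sided normal critical values. The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def t_statistic(r, n):
    """t ratio for testing a correlation r from n cases against zero."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


def significance_level(r, n):
    """Two-sided level reached by r, using large-sample normal critical values."""
    t = abs(t_statistic(r, n))
    z_05 = NormalDist().inv_cdf(0.975)  # about 1.96
    z_01 = NormalDist().inv_cdf(0.995)  # about 2.58
    if t >= z_01:
        return "1 per cent"
    if t >= z_05:
        return "5 per cent"
    return "not significant"


# Coefficients with dislikes from Table 1, n = 411.
for r in (-0.15, -0.10, -0.18, -0.16, -0.06):
    print(r, round(t_statistic(r, 411), 2), significance_level(r, 411))
```

With 411 cases, a coefficient of about .10 reaches the 5 per cent level and one of about .13 the 1 per cent level, which agrees with the markings in Table 1.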

The person who likes more items tends to be a better student both in 
high school and in college, although he does not tend to possess more aca- 
demic ability. He tends to be more social and to have more social skills, 
as defined by the test used, and tends to have better morale. The person 
who dislikes many items tends to be a poorer student, has less ability, 
and has a less satisfactory social adjustment and morale, as the negative 
correlations demonstrate. 

[Figure not reproduced: profile of letter ratings by occupational group and for the non-occupational interest keys.] 

Fig. 1. Summary of ratings on Strong vocational interest blank for men when all the 
likes are marked. 


Liking things, on the basis of this, seems to be a socially approved 
behavior and the more widespread a person’s likes and the more restricted 
his dislikes, the better his chances appear of making a satisfactory social 
adjustment. The generally enthusiastic individual is the one most likely 
to be successful in the social area. Such generalizations can be applied 
to individuals with only the greatest of caution, in light of the small 
correlations. 

In order to determine the effect of liking and disliking tendencies upon 
the interest profile, the answer sheets on which all the likes, dislikes, or 
indifferents were systematically marked were scored for 34 occupational 
keys and two non-occupational keys for the men’s blank and 15 occupa- 
tional keys and one non-occupational key for the women’s blank. 


[Figure not reproduced: profile of letter ratings by occupational group and for the non-occupational interest keys.] 

Fig. 2. Summary of ratings on Strong vocational interest blank for men when all the 
dislikes are marked. 


The results with the revised blank are essentially in agreement with 
those found on the unrevised blank by Rulon, although here they are more 
clearcut. The profiles obtained by marking all the likes, dislikes and 
indifferents on the men’s blank are presented in Figures 1-3. 

Marking all the likes produces a profile characteristic of that obtained 
by people in the social service fields. This profile is also similar to that 
obtained by people in the business detail field, the skilled trade field and 
production managers. Marking all the dislikes on the test produces a 
profile characteristic of people in the verbalistic-linguistic fields such as 
journalism, law and advertising and people in the scientific-creative field, 
presidents of manufacturing concerns and real estate salesmen. Marking 
all the indifferents produces a profile characteristic of the people in the 
skilled trades, some of the people in the social service field and musicians. 


[Figure not reproduced: profile of letter ratings by occupational group and for the non-occupational interest keys.] 

Fig. 3. Summary of ratings on Strong vocational interest blank for men when all the 
indifferents are marked. 


People who like a preponderance of items tend to have feminine 
interests, in terms of the masculinity-femininity scale. People who dislike 
the items also tend to have feminine interests. People who are indifferent 
to the items of the test tend to have masculine interests. This is in agree- 
ment with the findings of Terman and Miles. Marking extremes on tests 
is a feminine characteristic. 

People who like many items will have occupational level scores slightly 
below average. People who dislike many items will have extremely high 
occupational level scores, while people who are indifferent to these items 
tend to have very low occupational level scores. Marking many indiffer- 
ents indicates that either the individual does not respond affectively in 
terms of liking or disliking to many situations he has encountered or that 
a limited range of experience has resulted in a paucity of such situations 
and therefore, knowing nothing about them, he cannot respond emotion- 
ally and must therefore check himself as indifferent. People coming from 
limited environments, from homes and schools where opportunities are 
not available to secure extensive experience, might have low occupational 
level scores. People from homes and schools offering many of these 
opportunities, people who have traveled and read much and have had 
many different types of friends will perhaps tend to have higher occupa- 
tional level scores. 

The profiles obtained by systematically marking the likes, dislikes, 
and indifferents for the women’s blank are presented in Figures 4-6. 
Marking the likes on this blank provides a profile characteristic of women 
in the general occupations of business and nursing and to a lesser extent, 
women in the social service fields, including law. Men in business and 
social service were also characterized by many likes, it will be remembered, 
although men lawyers were characterized by many dislikes. In women, 
liking appears to be a distinctively feminine characteristic, as shown by 
the high femininity score. This offers one clue to the difficulty encoun- 
tered in obtaining as distinct differentiation among the vocational inter- 
ests of women as compared to that obtained among the interests of men. 
Women are more likely to accept and like most objects and activities in 
their environment and to show fewer avoidant reactions. They perhaps 
have on these blanks more positive interests than men but fewer negative 
ones and these negative interests or dislikes might provide the reason for 
obtaining the sharper distinction among the interests of men. 


[Figures not reproduced: profiles of letter ratings.] 

Fig. 4. Summary of ratings on Strong vocational interest blank for women when all 
the likes are marked. 

Fig. 5. Summary of ratings on Strong vocational interest blank for women when all 
the dislikes are marked. 

Fig. 6. Summary of ratings on Strong vocational interest blank for women when all 
the indifferents are marked. 

As among men, marking all the dislikes provides a profile character- 
istic of women in the verbalistic-linguistic-creative field, authors, li- 
brarians, artists and physicians. The obtained pattern is very clear-cut 
and well defined. Women with many dislikes also tend to have more 
masculine vocational interests. 

Women indifferent to many items of the Strong blank are not found 
concentrated in any one special group. General office workers and 
teachers of mathematics and science are indifferent. Women with neither 
many likes nor dislikes tend to have very masculine interests as shown 
by the masculinity-femininity score of 1. 


Summary 


The extent to which items are liked on the Strong Vocational Interest 
Blank for men is positively correlated with school achievement and other 
personality test scores purporting to describe social adjustment and 
morale. These correlations are stable but are too small to be of any use 
in predicting adjustment. The extent to which a person likes or dislikes 
the items on the blank is also closely related to his vocational interests. 
Emotional acceptance of their surroundings is typical of people in socially 
directed occupations. Rejection of these surroundings, perhaps an ex- 
pression of cynical disillusionment, is characteristic of people in those 
occupations usually considered as revealing the realities of life with great 
emphasis. General indifference to these surroundings is characteristic of 
those people in the more mundane occupations which perhaps do not 
inspire too much enthusiasm on the part of the people in them. Perhaps 
the people are in those occupations because they do not require great 
vocational or other stimulations. 

It must be remembered that the obtained results are dependent upon 
the items of the Strong Vocational Interest Blank and that these items 
are the stimuli to which the individuals respond in demonstrating their 
emotional reactions. Very likely a list of stimulus items could be 
established that would provide us with a very different set of results. 



Developing an Industrial Merit Rating Scale 


Joseph E. Zerga 
Social Security Board, Bureau of Employment Security 


American industry is gradually accepting the fact that the periodic, 
objective, and analytical rating of employees is as important and as es- 
sential as the scientific study of production processes. Dunford (15) 
rightly believes that the effective utilization of human effort necessitates 
the evaluation of that effort in the work it is doing. 

In 1939, Starr and Greenly (34) sent questionnaires to 64 companies, 
representative of the major branches of industry in the United States, 
requesting samples of their merit rating scales and information concerning 
their application and use. From the 44 returns received they estimated 
that only approximately one-third of all industrial organizations are 
making use of merit rating techniques. 

To illustrate the methods used in developing a rating scale, what a 
rating scale consists of, and how a rating scale is used, the merit rating 
programs of a number of large industrial organizations will be briefly re- 
viewed. Following this review consideration will be given to those steps 
leading up to the establishment of a sound rating program, the benefits 
to be derived from the established program, reasons why some programs 
are unsuccessful, and the employee-rating scale-management relationship. 

In 1934, The Atlantic Refining Company (8) initiated a program of 
job analysis and evaluation at its main refinery in Philadelphia. The 
purpose of this program was to remedy the apparent inconsistencies in 
non-clerical wage and salary scales, and to fulfill the recognized need for 
a systematic means of adequately adjusting wage and salary rate disagree- 
ments. For a number of reasons, including the necessity of using a 
technique applicable to the salaried as well as the hourly wage rated 
group, the “Factor Comparison Method” of job evaluation was selected. 
Following a detailed analysis and evaluation of the hourly wage rated 
positions a study was made of the salaried key positions. Information 
regarding the salaried positions was first obtained through a questionnaire 
form which was supplied to each salaried employee and the immediate 
supervisor for each position or group of positions. The form called for 
such information as a description of daily, regular, periodic and occasional 
duties; the training grade for the position, i.e., training time; desirable 
and undesirable working conditions; responsibility; hazards; working 
hours, etc. Following a thorough examination of the questionnaires, job 
analysts interviewed as many employees as was necessary to secure further 
supplementary information concerning each job. This information was 
then organized and written up on a job description or job specification 
form. The analyzed salaried key positions were then ranked by the job 
analysts for the five following critical factors: Mental Effort, Skill, 
Physical Effort, Responsibility, and Working Conditions. The rankings 
were then averaged, and if necessary readjusted. The salaried positions 
were then reduced to hourly wage rate positions for each of the five 
critical factors so as to establish measuring sticks for other salaried posi- 
tions. Following the establishment of grade and salary ranges for each 
position, a detailed labor market survey was made to compare labor 
market rates and the company’s salary scale. 
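
The ranking-and-averaging step described above can be sketched as follows for a single factor. The positions, the analysts, and their ranks are entirely hypothetical, and the function name `average_rankings` is illustrative; the article does not say how tied or disputed rankings were readjusted, so that step is omitted.

```python
from statistics import mean

# The five critical factors named in the text.
FACTORS = ("Mental Effort", "Skill", "Physical Effort",
           "Responsibility", "Working Conditions")


def average_rankings(rankings):
    """Average each position's rank over all analysts for one factor.

    rankings maps analyst -> {position: rank}, where rank 1 is highest.
    """
    positions = next(iter(rankings.values())).keys()
    return {p: mean(r[p] for r in rankings.values()) for p in positions}


# Hypothetical ranks given by three analysts on the factor "Skill".
skill = {
    "analyst_1": {"Clerk": 3, "Chemist": 1, "Foreman": 2},
    "analyst_2": {"Clerk": 3, "Chemist": 2, "Foreman": 1},
    "analyst_3": {"Clerk": 3, "Chemist": 1, "Foreman": 2},
}

averaged = average_rankings(skill)
ordered = sorted(averaged, key=averaged.get)  # lowest mean rank first
print(ordered)
```

The same averaging would be repeated for each of the five factors before the key positions are used as measuring sticks.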

The Jones and Laughlin Steel Corporation’s employee merit rating 
scale (3) contains ten qualifications to be rated: Job Performance (quality, 
quantity, waste, number of errors and overall efficiency of the employee); 
Overall Knowledge of the Job (knowledge regarding the present job and 
other closely related jobs); Industry and Dependability (employee’s effort 
and application to his job, attendance and punctuality, reliability and 
dependability); Aptitude and Ability to Learn (how quickly the employee 
can successfully perform the duties of a new job, how quickly he learns, 
grasps new ideas and retains information); Initiative (employee’s in- 
genuity, self-reliance and resourcefulness in thinking, planning and carry- 
ing out the duties of his job); Judgment (intelligence, logic and thought 
the employee uses in relation to his job); Disposition and Attitude (soci- 
ability with fellow employees, attitude toward the boss and the corpora- 
tion’s plans, policies, objectives and interests); Personality (impression 
the employee makes on others); Safety (employee’s reaction and ad- 
herence to the safety program); and, Health and Physical Condition 
(employee’s general health and physical condition in relation to his work). 
The rater rates the employee along a horizontal scale for each qualifica- 
tion, the ratings being either Poor, Below Average, Average, Above 
Average, or Excellent. Each rating is further broken down into numerical 
units; for example, a rating of Poor consists of five numerical units placed 
in boxes or blocks and ranging from low (9 units) to high (45 units) in 
five unit steps. The rater places a check in the box containing the ap- 
propriate numerical score which he believes applies to the employee being 
rated. To prevent misinterpretation of the qualifications and ratings by 
the rater, each qualification and rating is carefully defined on the rating 
sheet. 

The Farrel-Birmingham Company, Inc., (31) of Ansonia, Connecticut, 
uses a rating plan which is essentially the same as that formulated by the 
Metal Trades Association. There are eleven evaluating factors or quali- 
fications on the job rating sheet, each being classified by degrees. Point 
values are then assigned to each factor for the first degree, the second 
degree carrying twice the point value of the first, the third degree three 
times as many, etc. The eleven factors are: Education, Experience, 
Initiative and Ingenuity, Physical Demand, Mental or Visual Demand, 
Responsibility for Equipment or Process, Responsibility for Material or 
Product, Responsibility for Safety of Others, Responsibility for Work of 
Others, Working Conditions, and Unavoidable Hazards. 
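
The degree-multiple scheme just described may be sketched as below. The eleven factor names are those listed above; the first-degree point values and the degrees assigned to the sample job are hypothetical, since the article does not report them, and the function names are illustrative.

```python
FACTORS = (
    "Education", "Experience", "Initiative and Ingenuity", "Physical Demand",
    "Mental or Visual Demand", "Responsibility for Equipment or Process",
    "Responsibility for Material or Product",
    "Responsibility for Safety of Others", "Responsibility for Work of Others",
    "Working Conditions", "Unavoidable Hazards",
)


def factor_points(first_degree_points, degree):
    """Points for one factor: the n-th degree carries n times the first degree."""
    if degree < 1:
        raise ValueError("degree must be 1 or higher")
    return first_degree_points * degree


def job_points(first_degree, degrees):
    """Total points for a job over all eleven factors."""
    return sum(factor_points(first_degree[f], degrees[f]) for f in FACTORS)


# Hypothetical first-degree values and a hypothetical job.
first_degree = {f: 10 for f in FACTORS}
first_degree["Experience"] = 20
degrees = {f: 1 for f in FACTORS}
degrees["Experience"] = 3
degrees["Education"] = 2

print(job_points(first_degree, degrees))  # 60 + 20 + 9 * 10 = 170
```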

Each completed rating is reviewed by the department superintendent, 
the department manager, and finally by the vice-president and manu- 
facturing manager. The company states the rating plan has benefitted 
it in a number of ways. In the first place, it has eliminated favoritism 
in the establishment of wage rates. Secondly, it pays the individual 
worker for his actual skill. Thirdly, it has improved employee morale. 
Lastly, it has made possible the intelligent comparison of the company’s 
wage rates with those of other companies. 

There are six qualifying factors that determine the worth of an em- 
ployee to the McKinsey, Wellington and Company (12) of New York. 
They are: 1. Performance as to quantity produced (75); 2. Performance 
as to quality of output (5); 3. Ability to prepare a machine or work place 
and make set-ups (5); 4. Versatility as to product, process, or machine 
(5); 5. Length of service (5); and, 6. Management’s rating based upon 
safety, conduct, sobriety and attendance (5). The figures in parentheses 
after each factor indicate the total maximum number of points allowed 
for each factor during the rating. For example, a workman who requires 
no assistance in setting up his machine and preparing for work receives 
a five point total under factor three; whereas, a workman who has to have 
his machine set up for him and his workplace prepared for him receives 
no points under factor three. Item, or factor, five is rated upon the basis 
of one point for every year of continuous service up to the maximum of 
five points for five years. The total number of points the employee 
receives on his rating determines what percentage of the base rate for the 
group in which he falls will be his hourly wage rate. Each employee’s 
merit rating is reviewed every thirty days. 
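
The point scheme above can be written out as a short sketch. The maxima are those given in the text and sum to 100; the dictionary keys, the function names, and the sample workman are hypothetical. The conversion from total points to a percentage of the base rate is not specified in the article and is therefore left out.

```python
# Maximum points per factor, as given in the text.
MAX_POINTS = {
    "quantity": 75,
    "quality": 5,
    "setup": 5,
    "versatility": 5,
    "length_of_service": 5,
    "management_rating": 5,
}


def service_points(years_of_service):
    """One point per full year of continuous service, up to the maximum of five."""
    return min(int(years_of_service), MAX_POINTS["length_of_service"])


def total_points(points):
    """Sum a workman's points after checking each factor against its maximum."""
    if set(points) != set(MAX_POINTS):
        raise ValueError("every factor must be rated exactly once")
    for factor, value in points.items():
        if not 0 <= value <= MAX_POINTS[factor]:
            raise ValueError(f"{factor} must lie between 0 and {MAX_POINTS[factor]}")
    return sum(points.values())


# Hypothetical workman who sets up his own machine and has three years of service.
workman = {
    "quantity": 60,
    "quality": 4,
    "setup": 5,
    "versatility": 2,
    "length_of_service": service_points(3),
    "management_rating": 4,
}

print(total_points(workman))  # 60 + 4 + 5 + 2 + 3 + 4 = 78
```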

The Carnegie-Illinois Steel Corporation (38) uses a rating scale that 
was developed from the opinions of 406 employers, utilizing the traits 
they considered most essential to success after graduation. The rating 
scale is in graphic form, first listing the trait name and its definition, then 
the types of behavior by which the trait is judged. The traits rated are: 
Judgment and Common Sense; Leadership; Initiative; Intelligence; Tact; 
Physical Bearing and Neatness; Force; and, Attention to Work. The 
types of behavior used to rate each trait are: Inferior, Unsatisfactory, 
Satisfactory, Excellent, and Superior. The rating scale serves the cor- 
poration in the following ways: It develops the employees’ personality, 
makes comparisons between employees, shows the progress of each in- 
dividual employee, and effectively improves employee morale. 

The construction of an adequate and reliable rating scale should be as 
carefully planned as a good test, not haphazardly thrown together. 
Consequently, management must take into consideration a number of 
important preliminary steps leading to the establishment of a rating pro- 
gram. In the first place, every job in the plant should be subjected to a 
thorough job analysis. Secondly, all plant jobs or positions having the 
same duties, responsibilities, and qualification requirements should be 
classified under one job title. Lastly, job characteristics that may be 
objectively evaluated (13) should not be included in a rating scale. These 
preliminary steps will facilitate the construction of a rating scale that 
may be adapted to the actual duties of either a single employee or a group 
of employees. 

Merit rating systems benefit management in a number of different 
ways: 1. They supply a basis for promotions, demotions, and transfers. 
2. They determine the training needs for various employees within the 
various departments of the plant. 3. They increase plant efficiency by 
improving job performance and conversely reducing plant cost. 4. They 
prevent grievances between employees and between employees and 
management. 5. They increase the analytical ability of the supervisors 
in relation to job performance. 

There are a number of factors requiring careful consideration in any 
merit rating program, neglect of which will result in unreliable ratings and 
the subsequent failure of the entire program. The first of these is the 
tendency of a rater to rate an employee solely on the basis of a few out- 
standing satisfactory or unsatisfactory traits. This tendency toward 
biased rating is termed the “halo effect”’ by Thorndike (36). There are, 
however, a number of methods by which this tendency may be overcome 
to a certain extent. It may be minimized by rating all of the employees 
in one department, on similar jobs, or under one supervisor, on one 
specific trait at a time. Another method is to arrange the rating chart 
or scale in such a manner that it will not be possible for the rater to give 
employees desirable or undesirable ratings in a routine manner. The 
second factor requiring careful consideration is the possibility of misusing 
the weighting technique (37), resulting in the subsequent misinterpreta- 
tion of the total scores obtained from the weighted items of the rating 
scale. A third factor is the failure to determine the reliability of the 
ratings (37), including the danger of pooling unreliable with reliable 
ratings. Finally, if separate rating scales have not been prepared for 
individual or closely related jobs, careful consideration must be given to 
the differences existing between departments and jobs when interpreting 
the results of ratings. 

It should be emphasized, however, that the total score obtained on a 
rating should not be the only factor to be considered in promotions, 
transfers, demotions or dismissals. The results of reliable rating scales 
should be considered supplementary information to other and more ob- 
jective personnel records, such as, production, job tenure, etc. 

Immediately following the rating of employees steps should be taken 
to utilize the results to the fullest possible extent. Following the analysis 
of each individual rating form, and its comparison with other personnel 
records, the results should be made known to the employee. Notifying 
the employee of his rating requires much tact on the part of the supervisor 
or interviewing official. To avoid possible grievances and a detrimental 
effect on employee morale, it has been recommended (37) that letter 
grades be substituted for numerical scores. The method by which an 
employee is handled by his superiors will determine his future attitude 
toward the company and management, and also the amount of effort he 
will put forth in an endeavor to correct any of his individual shortcomings. 

The following bibliography of 39 titles has been carefully selected to 
cover all aspects of industrial merit rating programs. 



A Note on the Experimental Study of the 
Appraisal Interview 


Edwin S. Shneidman 
City Civil Service Commission, San Diego, California * 


In the selection or promotion of candidates for positions in the public 
service, the personnel technician must make a sharp distinction between 
the written and oral parts of the examination. The latter has the unique 
function of appraising the candidate’s general fitness or ability in terms 
of his personal qualifications for a specific class of employment rather 
than his knowledges or skills related to that work. The fact that such 
appraisals can be made only in the oral interview, gives it a particular 
importance and makes the present lack of knowledge of the underlying 
factors which determine its course especially conspicuous. 

The purpose of the present paper is to suggest a method for the ex- 
perimental study of the problems of the appraisal interview. The experi- 
ment to be reported grows out of the difficulty which confronts the inter- 
viewers (usually a board of two or more persons) of creating a situation 
in which the appraisal of general fitness can be made without asking 
questions which relate to the candidate’s specific knowledge, information, 
or skill with respect to the job. 

Specifically, the purpose was to set up two interview situations: in 
one the candidates were questioned in the usual manner by an interview 
board, while in the other they were asked to react not to formal questions 
but to a set of stimuli that were neutral insofar as the job in question was 
concerned. In both cases the members of the interview boards appraised 
the candidates in terms of a specific set of traits. The neutral stimuli 
selected for the experimental interviews were the Rorschach series of ink 
blot cards. It should be emphasized that the intent was to use the 
Rorschach material because it presented a neutral stimulus situation for 
the candidates. This study was in no way concerned with the theory or 
methods of interpretation usually associated with the Rorschach technique. 
Indeed, the oral raters for the experimental interview board¹ 
had never seen nor heard of the Rorschach procedure before. 

* The author, now on military leave of absence, conducted the present study while 
holding the position of Personnel Technician with the City of Los Angeles, California. 

1 The original oral interview board consisted of a professor of psychology and two 
employment interviewers from two California State Employment Agencies. The experimental 
interview board included a professor of public personnel administration, a 
principal of a large business school, and a branch head librarian. 

The use of ink blots in the interview situation creates certain definite 
changes in the role of the interviewer. Strictly speaking, he no longer 
interrogates in a give-and-take conversation, but he limits his behavior 
largely to the observation of each candidate. The verbal responses of 
each candidate—except those on being introduced to and taking leave of 
the oral board—are confined to his spoken interpretations of the series of 
ink blot cards. The situation changes in character from an interview 
which is social and reciprocal to an appraisal which is relatively static 
and uni-directional, lacking the shifting inter-play of conversation. 

There were two hypotheses on which this study was premised: (1) the 
more uniform stimulus situations engendered by the use of the ink blots 
would result in greater uniformity of ratings among the judges in the 
experimental interviews; and (2) the standardization of the interview 
procedure would eliminate the criticism that a series of interviews does 
not present nearly identical situations to all candidates. 


Procedure 


The ten standardized 7¼ by 9½ inch cards which constitute the essential 
part of the Rorschach test contain ink blots of different shapes. 
The figures on the cards are partly black and gray, partly colored. In 
adapting the procedure of administration of the Rorschach technique to 
the interview situation, the subject was seated to face the group of three 
oral raters, one of whom handed him the cards. Each candidate in the 
experimental interview, after being introduced to the members of the 
oral board, was instructed by one member of that body: 


“This interview is probably a little different from some you may have 
taken. Rather than ask you questions, we are merely going to ask you 
to interpret ten ink blots. You will be handed the blots one at a time 
and asked: ‘What might this be?’ When you have finished with each 
blot you are to return it to us and we shall hand you the next one. There 
are no other instructions or rules. Here is the first card. What might 
this be?” 


When questions were asked by the candidate, he was answered with: 
“That is entirely up to you.” No information regarding the use, pur- 
poses, or possible significance of the interview was given any candidate 
at that time. 

The subjects were candidates in an examination for the class, Deputy 
Clerk—Deputy Marshal, given approximately one month before the ex- 
perimental interviews by the Los Angeles City Civil Service Commission. 
Only those employed at that time by the Commission were used as 
subjects in the second interviews. 

Two days before the experimental interviews and a few days after the 
eligible register for Deputy Clerk—Deputy Marshal became legal, the 
subjects, who had all been bona fide and serious candidates in the ex- 
amination, were notified by the Director of Examinations that protests 
on the oral interview grades had necessitated the re-scheduling of the 
interviews with another oral board. While this kind of situation is highly 
improbable according to the rules under which the Commission operates, 
it may be assumed that the subjects took the Director’s words in good 
faith and did not display significantly less concern, incentive, or legitimate 
effort in the experimental interview than they did in the first. 

The traits rated were Personal Appearance, Mental Alertness, Ability 
of Self-Expression, Manner and Bearing, Ability to Get Along Well with 
Others, Adaptability to New Work, and Summary Rating. The degrees 
of endorsement, made on a graphic rating scale, were Inadequate, Border- 
line, Suitable, Good, and Outstanding. 

At the completion of each of the fifteen experimental interviews, the 
candidate was directed into an adjoining room where he was informed 
that the interview was not a re-test and was asked not to reveal the nature 
of the interview. He was then given a page on which he was requested 
to check one of three possible answers to three questions relating to a 
comparison of the conventionally conducted interview and the interview 
with ink blots; to state the number of civil service interviews he had taken 
previously; to indicate whether or not he had ever seen the blots before; 
and to write any comments he felt inclined to make. 

At the end of the day, after all the subjects had been interviewed, the 
oral examiners were asked to fill out a special questionnaire on which they 
indicated the degree of their general endorsement of this type of inter- 
view and wrote their comments relative to the use of ink blots as a means 
of obtaining an estimation of candidates’ personal qualifications. 


Results 


A summary of the checks made on the Report Sheet for Candidates 
by the fifteen subjects is presented below: 


Question 1. “. . . I like this interview 
(a) better than the conventional interview 
(b) neither more nor less 
(c) not so well as 
Question 2. “As compared to the conventionally conducted 
interview, this type of interview put me 
(a) more at ease 
(b) equally at ease 
(c) less at ease 
Question 3. “In regard to accuracy and fairness, I would rate 
this type of interview as 
(a) superior to the customary interview 
(b) not significantly different from 
(c) inferior to the customary interview” 
Question 4. The candidates indicated that they previously had 
taken from two to ten civil service interviews. 
Question 5. Four of the fifteen candidates stated that they had 
seen the blots before. 


The total of the 1(a), 2(a), and 3(a) responses—“better,” “more,” 
and “superior”—is 11; the total of the 1(b), 2(b), and 3(b) responses— 
“neither more nor less,” “equally,” and “not significantly different from” 
—is 12; and the total of the 1(c), 2(c), and 3(c) responses—“not so well 
as,” “less,” and “inferior”—is 22. Therefore, 11 responses indicated that 
the interview with ink blots was preferred to the conventional interview; 
12 indicated that it was neither better nor worse; while 22 indicated that 
it was thought inferior. 
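
The tally above can be checked with a few lines of Python. This is only an illustrative sketch: the labels and variable names are ours, and the percentage shares it prints are not reported in the original note.

```python
# Totals of the (a), (b), and (c) checks across Questions 1-3, as given in the text.
totals = {
    "ink-blot interview preferred": 11,
    "no difference": 12,
    "conventional interview preferred": 22,
}

n_candidates = 15
n_questions = 3

# Each candidate checked one answer per question, so the totals should sum to 45.
grand_total = sum(totals.values())
assert grand_total == n_candidates * n_questions

# Share of all checks in each category (a derived figure, not in the original).
shares = {label: count / grand_total for label, count in totals.items()}
for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```

On these figures, 22 of the 45 checks, or roughly half, favored the conventional interview.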

On the qualitative side, certain important points were made by the 
candidates. The following quotations are extracts from the comments 
written by the subjects after they had been interviewed. 

1. Two of the candidates misinterpreted the experiment as a psycho- 
analytical study. 


“Frankly, I do not feel quite competent to judge this type of interview 
as I really do not know just what they are driving at. It seemed more 
like a Freudian experiment to me than anything else.” 

“Because I am aware of the limitations of the psychoanalytic tech- 
niques and of the Rorschach test in particular, I would hesitate to endorse 
the use of such techniques in civil service examinations. . . .” 


2. Two of the candidates mentioned some advantages of using this 
type of subject matter in the oral interview. 


“It is an improvement, from my point of view, to give the candidate 
some objective problem to solve. This method brings out the characteristics 
of the candidate more clearly, I believe, than the artificial situation 
. . . making conversation from topics usually drawn from the application 
blank.” 

“From the standpoint of the raters, this type of interview seems to 
offer many desirable features. It allows them an opportunity to examine 
visible characteristics of a candidate (clothes, manner, etc.) at greater 
length and more carefully than in an ordinary interview. It also calls 
into play the candidate’s adjustment to this type of situation. . . . An- 
other advantage of this type of rating is that it eliminates the necessity 
of a rater’s knowing anything about the candidate’s experience or educa- 
tion, the knowledge of which often colors a rater’s judgment of other 
qualities possessed by a candidate. . . .” 

3. Another person wrote of disadvantages and limitations inherent 
in the use of the ink blots. 


“The interchange of ideas that verbal intercourse makes possible 
seems to me to be of the greatest value in judging or rating personality. 
It seems that both the oral examiners and the candidates are at a disad- 
vantage in this type of interview. I cannot see that it gives a person an 
adequate means of presenting his personality, although it may give the 
examiners the opportunity to determine a trait or two which more 
conventional interviews would not do. . . .” 


4. Two subjects thought that more comprehensive directions should 
have been given. 


“This type of interview would be more valuable if the candidate was 
aware of why he was being asked to interpret meaningless ink blots. The 
procedure makes the candidate feel ill at ease in that often it is really quite 
difficult to put any meaning into what he is asked to interpret.” 

“I think that the ratings in an oral interview using this technique have 
the advantage of being relatively free from bias that they would be subject 
to in an interview in which experience is discussed. However, to get a 
good display of personal traits, I think that more comprehensive instruc- 
tions should be given the subject. . . .” 


5. Two individuals commented on the public relations problems that 
would have been involved had this type of interview been used as part of 
a regular examination procedure. 


“. . . As far as substituting this type of interview for the conventional 
type which is associated with civil service examinations, I believe that 
this would be a very unwise move. The average candidate would not 
accept it as being practical and I think that public opinion would thus 
be alienated from the efficiency of the merit system principle.” 

“. . . What the general public who heard of this would think of the 
Commission and would publish in newspaper columns or satirize in 
movies is still another problem. Certainly though, this is an interesting 
attempt to improve what up to now has been a weak point in the ex- 
amination procedure.” 


6. The other comments were concerned with general dissatisfaction 
with, or concern about, the candidate’s own behavior in the experimental 
interview situation. 

The comments made by the three oral examiners after the fifteen 
experimental interviews were concluded constitute perhaps the most 
valuable data secured from this study. These comments are presented 
in verbatim form, labelled A, B, and C. 


A. “An interesting way of conducting an oral interview. It has con- 
siderable possibilities. If the candidate can forget the interview board 
and concentrate on examining the blots (and several did) the examiners 
get a fine idea of the ‘workings of the candidate’s mind.’ It does give the 
examiners a fine opportunity to judge personal appearance without ob- 
viously staring at the candidate. Self-expression is well brought out 
because the candidate is given full opportunity and does not have to be 
thinking of an answer to a definite question. Surely alertness is well 
tested in this type of examination because the examiner can observe the 
candidate’s reactions to the various blots. I do feel that there is the 
possibility that the candidate is perhaps not put fully at ease with only 
one of the examiners giving the preliminary instructions and the other 
two remaining mute. However, I believe that for the purpose of this 
examination, and almost all others, a fair and accurate picture can be 
drawn of the several qualities on which the candidates are to be graded.” 

B. “Thinking of an interview a person will think of possible questions 
and answers. Being given something new, an examiner has a chance to 
see what that person will do when faced with a problem he has had no 
time to think through. In dealing with people, one has to change quickly 
from one type to another. This test gives the examiner a good idea what 
a candidate will do and shows the candidate’s reaction to change, which 
is sometimes the only difference between a good and bad employee.” 

C. “There appear to be restrictions upon the candidates’ opportunity 
to express themselves. This may be a good thing and eliminate unneces- 
sary and confusing talk as well as more nearly standardize the scope of 
expression heard by the oral board. I could not see that candidates were 
either more or less at ease than those I have seen in the conventional type. 
The rapport seemed to improve as each candidate became more accustomed 
to the situation, i.e., after he had seen about four cards. 

“The attempt at so-called objectivity in the oral situation should, it 
seems to me, be in terms of the situation established for both the board 
members and the candidates. The simple instructions, the freedom of the 
board members from a responsibility of ‘keeping questions and answers 
flowing,’ the nearly identical situations for all candidates tend to increase 
the objectivity. For an opinion upon the validity, I would rather see 
some objective correlation made, if possible. The control of the situation 
and the more standard stimuli among candidates seem to offer a more 
adequate and fair basis for judging suitability. 

“I like the method for these off-hand reasons: (1) Freedom of board 
from burden of questions and attempts to secure questions of so-called 
revealing nature. (2) Standard questions and attempts to secure subject 
matter for discussion at least on a par with ‘Why are you interested in 
this job?’ (3) Ease of administering. (4) Possibility of more valid data 
for test research. (5) Time consumed aspect is up to candidates, not a 
burden on the board nor a clock watching requirement on members’ part. 
(6) Information revealed seems to lend itself fairly well to check list form, 
more especially the traits Mental Alertness, Ability of Self-Expression, 
Manner and Bearing, and Adaptability. Appearance and Summary 
Evaluation would be obtainable on any method used. 

“The criticisms associated with standardized paper and pencil and 
miniature tests will be directed at this form of oral interview. Clear 
explanation to candidates stated in the bulletin and continued at the 
time of the interview can in part off-set this. To this end, I believe a few 
additional instructions might be used; maybe only a sentence enlarging 
upon the fact that this is used instead of questions and is to evaluate 
certain factors required for the job and listed in the bulletin. 

“What are the reactions and suggestions of those taking the oral today? 
They should be interesting and valuable.” 


Statistical Data 


In the case of both the original and experimental interviews, the fifteen 
series of ratings of the three judges on each board were averaged for each 
trait, although the three ratings on each trait were considered as a sum. 
The rank-difference correlations between the ratings on each particular 
trait in the two interviews and the probable errors of these correlations 
are presented in Table 1, below. 


Table 1 


Rank-Difference Correlations between Average Trait Ratings of Original and 
Experimental Interview Boards 


Trait            rho    PE(rho) 

Appearance       .07     .18 
Alertness        --      .18 
Expression       --      .18 
Bearing          --      .18 
Congeniality     .41     .15 
Adaptability     .17     .17 
Summary          .03     .18 

-- : value illegible in the source copy. 





The correlations between the same traits in the different interviews, 
computed by the rank-difference method, range from .03 ± .18 to 
.41 ± .15. The correlation of the trait of “Appearance,” .07 ± .18, is 
significant in that it is statistically insignificant. Of all the traits meas- 
ured, it might be assumed, a priori, that the appearance of any candidate 
would be evaluated similarly by two comparable groups of raters regard- 
less of the difference in the nature of the interview which existed in this 
case. At least in regard to this trait, one must conclude that the two 
boards rated with measurably different biases. 

On the other hand, the rank-difference correlation between the two 
interviews on the trait called “Ability to Get Along Well with Others,” 
or “Congeniality,” was .41 ± .15, the highest measure of agreement obtained. 
In spite of the relatively large probable error, this correlation is 
strikingly high in light of the fact that this trait is one of the most nebulous 
and vague of all the aspects of personality rated. 
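
For readers who wish to reproduce figures of this kind, the following is a minimal sketch of the rank-difference (Spearman) correlation and its probable error. The note does not state the formulas used; the sketch assumes the textbook forms of the period, rho = 1 - 6Σd²/[N(N² - 1)] and PE = .7063(1 - rho²)/√N, with N = 15 candidates. The function names and the sample ratings are hypothetical. Tied scores receive averaged ranks, which is a simplification, since the d² formula is exact only when there are no ties.

```python
import math

def average_ranks(scores):
    """Rank scores from 1 (lowest) upward; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with the one at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_difference_rho(x, y):
    """Rank-difference correlation: 1 - 6*sum(d^2) / (N*(N^2 - 1))."""
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

def probable_error_rho(rho, n):
    """Assumed probable-error formula: .7063 * (1 - rho^2) / sqrt(N)."""
    return 0.7063 * (1 - rho ** 2) / math.sqrt(n)

# Hypothetical summed ratings of five candidates by two boards (not the study's data).
first_board = [9, 12, 7, 14, 10]
second_board = [8, 13, 9, 11, 10]

rho = rank_difference_rho(first_board, second_board)
print(f"rho = {rho:.2f}, PE = {probable_error_rho(rho, len(first_board)):.2f}")
```

Under these assumptions, `probable_error_rho(0.41, 15)` rounds to .15 and `probable_error_rho(0.07, 15)` rounds to .18, in agreement with the probable errors reported for “Congeniality” and “Appearance.”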

The rank-difference correlations between the initial interview scores 
and the experimental interview scores, when but seven traits are considered 
(with the data for “Education,” “Experience,” and “Interest in 
the Work” omitted), between the written part of the examination and the 
initial interview scores, and between the written part of the examination 
and the second or experimental interview scores were found to be 
non-significant. 

While there was no general significant agreement between the two 
boards, the indices showing the relative agreement among the raters 
themselves demonstrate which of the two groups displayed relatively more 
unanimity and agreement in its appraisals. These data are presented in 
Table 2, below, in the form of rank-difference correlations, and their 
probable errors, among the three examiners of each oral interview board. 


Table 2 


Rank-Difference Correlations of Ratings among Examiners of Original and 
Experimental Interview Boards 


                   Original Interview Board        Experimental Interview Board 

                  A&B       B&C       A&C          X&Y       Y&Z       X&Z 

Trait           rho  PE   rho  PE   rho  PE      rho  PE   rho  PE   rho  PE 

Appearance      .74 .09   .47 .14   .73 .09      .50 .14   .37 .16   .49 .14 
Alertness        --  --    --  --    --  --       --  --    --  --    --  -- 
Expression      .39 .16   .24 .17   .45 .15       --  --    --  --    --  -- 
Bearing         .81 .06   .35 .16  -.02 .18       --  --    --  --    --  -- 
Congeniality    .65 .11   .48 .14   .73 .09      .59 .12   .85 .05   .58 .12 
Adaptability    .65 .11   .62 .11   .84 .06      .45 .16    --  --    --  -- 
Summary         .74 .08    -- .13   .67 .10      .47 .14   .83 .06   .66 .10 

-- : value illegible in the source copy. 


In terms of the relative reliabilities of the two groups of raters, the 
correlations, and their measures of significance, shown in Table 2, reveal 
that (1) the order of magnitude of the correlations is about the same in 
the two groups; and (2) the correlations for both groups show both con- 
siderable variability among raters and an amount of agreement and dis- 
agreement with respect to traits that is usually expected from oral inter- 
view boards. That is, both boards behaved much the same in spite of 
the introduction of a drastically different method of interviewing. This 
fact would appear to minimize the error involved in the use of the two 
board technique. 


Discussion 


There are certain sources of error in the experiment as a whole that 
may be considered in the interpretation of the statistical data. 

1. The most serious error arises from the fact that the original board 
had an enormously greater range of experience in interviewing candidates 
for positions in the class of Deputy Clerk—Deputy Marshal. This board 
worked together for a longer period of time and thereby set up important 
psychological relationships that could not be present in the experimental 
oral board. 

2. The fifteen subjects used in the experiment were a rather select 
group of the total number of 186 called to the initial interviews. All 
fifteen were employed at the time of the experiment by the Commission 
in a clerical-technical class, indicating that they had already successfully 
passed through a competitive selection process. 

3. The low order of the inter-board correlations on the several traits 
may be a commentary on the interview process itself, indicating that the 
same rating form and similar instructions to raters may not yield appre- 
ciably better results than might be expected from informal, unstandard- 
ized estimates. The facts are, however, that neither the total group nor 
the raters in the two interviews were the same. 

4. It might be expected that the correlations between ratings on the 
same traits in the two interviews would tend to be low because of the 
deletion in the experimental interview of three traits discussed and rated 
in the initial interview, which might have influenced other ratings made 
at that time. These traits were “Evaluation of Education,” “Evaluation 
of Experience,” and “Interest in the Work”—the three which obviously 
could not be rated from responses to ink blots. In the first interview, 
the ratings on “Education” and “Experience” for the fifteen candidates 
subsequently re-interviewed were generally high, while the ratings on 
“Interest” were strikingly low. 

5. Although the data presented in Table 2 might be taken to indicate 
otherwise, the difference in the kind of interviews would seem to be an 
operative variable. The first interview was of the type that might be 
described as conversational or social, in that the verbal responses were in 
the nature of continual stimulation and response among the candidate 
and the raters. The experimental interview, on the other hand, employed 
not this “circular” type of response that engenders a psychological barrier 
between the interviewer and the interviewee, and which characterizes the 
true interview, but was more “linear” and mechanical, in that the verbal 
responses made during the interview were almost entirely from the candi- 
dates to the raters. In the second or “observational” interview situation, 
the same traits had to be appraised on the basis of manifestations of 
different qualities of response than those employed in the original interview, 
as there were few of the nuances of behavior that come from a social 
conversation. 


Suggestions 


With the possibility of some future experimentation along this line in 
mind, these major changes of procedure may be suggested: (1) The need 
of a large enough number of candidates to give significance to the statis- 
tical results is evident. (2) The use of the same oral raters for both the 
initial and experimental interviews is recommended. While it might then 
be possible that the raters’ judgments in the second interview would be 
influenced by their ratings in the first, the uncontrollable variable introduced 
by having “comparable” interview boards would be eliminated. 














Errata 


Teegarden, L. Manipulative performance of young adult applicants at 
a public employment office. J. appl. Psychol., 1942, 26, 633-652, 
754-769. 


Page 641, line 30. Correct to read: with two exceptions, Placing and 
the first or practice half of Plier Dexterity. (For comment on Placing, 
see correction for pp. 760 and 769, below.) 

Page 760, lines 10-14. For these three sentences substitute: The 
difference between means for men and women on Placing is more than 
four times its standard deviation, and for Turning slightly less than half 
its standard deviation. This is a significant difference for Placing but not 
for Turning. The Educational Test Bureau reports norms which are 
practically identical for both sexes. 

Page 769, lines 13-17. Substitute: 4. The sex difference was not sig- 
nificant for Spatial Relations nor for Turning or for the last half of the 
Plier Dexterity test. It was significant for Placing and for the first half 
of the Plier test. Possible reasons for the significant differences are 
discussed. 





News and Notes 


At the January 15-18, 1943 meeting of the officers and committee 
chairmen of the Council of Guidance and Personnel Associations, the war 
manpower problem as related to counseling and personnel work was 
discussed. Dr. E. G. Williamson, chairman of the advisory committee 
to the Education Branch of the Special Services Division, U. S. Army, 
revealed that a plan for occupational rehabilitation of soldiers is now 
being considered by the Army. The proposal contemplates the appoint- 
ment of guidance officers who will interview, test, and counsel servicemen 
in terms of education and employment after their demobilization. Men 
now engaged in Army personnel classification or assignment or as Army 
psychologists will be selected for the work of post-war guidance. 





Two new forms, A and B, of the Wonderlic Personnel Test are an- 
nounced to supplement forms D, E, and F which have been available for 
several years. The Psychological Corporation will act as distributors. 





Foster Parents’ Plan for War Children, Inc., 55 West 42nd Street, 
New York, New York, announced the issuance of a monthly report 
on the psychological care of children in England by Miss Anna Freud 
and Dorothy T. Burlingham. The subscription cost is $10.00 per year. 





Michael Reese Hospital announces that Dr. S. J. Beck (Ph.D.) will 
offer his usual course this year in the Rorschach test. Accent will be on 
those less serious mental disturbances in which success in treatment 
appears possible. The differentiating patterns of the test, in those 
patients, will be studied from full response records, and contrasted with 
those found in more serious conditions. The course will be in session 
two two-hour periods daily for five days, June 7-11, 1943, inclusive. 
Interested persons are invited to communicate with the Department of 
Neuropsychiatry, Michael Reese Hospital, Chicago. 





The January, 1943, issue of Occupational Psychology contains an in- 
formative article by Dr. C. S. Myers of the National Institute of Industrial 
Psychology describing the work of the Directorate of Selection of Per- 
sonnel in the British Army. 





Selective Service Occupational Bulletin No. 11, issued under date of 
March 1, 1943, fails to list psychology as a shortage field and therefore 
psychology is no longer a deferrable field for students in training. This 
bulletin replaces Occupational Bulletin No. 10 as amended on December 
14, 1942, and earlier issues in which psychology was recognized as a 
shortage field. 





The War Manpower Commission’s list of essential occupations in 
essential industries fails to list Clinical Psychology among the thirteen 
occupations listed in the field of Health and Welfare. Among the essen- 
tial occupations listed in this essential industry are physician, dentist, 
dental hygienist, dietician, medical technician, and podiatrist 
(chiropodist). 





The Research Council on Problems of Alcohol, Bronxville, New York 
announced a $1,000 award for outstanding research on alcoholism during 
1943. Reports of such research must be submitted on or before February 
15, 1944. The Council will send on request, to any scientist, an outline 
of basic policies governing its research program, and the conditions govern- 
ing the award. The Council is an associated society of the American 
Association for the Advancement of Science. 





Dr. Albert D. Freiberg, Technical Director of Marketing Research, 
The Psychological Corporation sends the following news note: 

“Since 1935 there have been a series of Annual Advertising Awards 
sponsored by Advertising and Selling to continue the Harvard Advertising 
Awards sponsored by Edward Bok from 1924 to 1930. This year the 
Psychological Corporation received the medal award ‘For an original 
research development conducted by an independent individual or or- 
ganization, not designed or used directly for the promotion of any media, 
product, or service.’ This award was given for ‘Two Studies of Public 
Sentiment Toward Wartime Advertising’ for the Association of National 
Advertisers. These studies were two of the regular studies in which a 
large number of psychologists in different universities and colleges super- 
vised field work. They were conducted by the Marketing Research 
Division of the Corporation, under the direction of Dr. Henry C. Link.” 





The Medical Correctional Association, an affiliate of the American 
Prison Association, is interested in establishing contact with all profes- 
sional personnel who are especially concerned with, or interested in, the 
medical aspects of crime. Physicians, Social Welfare Workers, Psycholo- 
gists and other professional workers in the field of criminology are 
eligible for membership. Annual dues are one dollar. Application 
inquiries should be directed to Dr. Robert M. Lindner, United States 
Penitentiary Hospital, Lewisburg, Pennsylvania. 





Book Reviews 


Klopfer, B., and Kelley, D. M. The Rorschach Technique. Yonkers-on- 
Hudson, New York: World Book Company, 1942. Pp. x + 436. 


This book is a manual for the administration, scoring, and interpre- 
tation of the Rorschach method of personality diagnosis. It describes 
the method as used by the majority of workers in this country, and 
deviates in some respects from the original formulations of H. Rorschach 
in his Psychodiagnostik. These changes have been mainly in the direction 
of multiplying and making more explicit the scoring categories; Ror- 
schach’s basic assumptions, the theory of personality structure, and the 
interpretative significance of the categories remain relatively unchanged. 

The book is divided into four parts, the first three of which were 
written by Dr. Klopfer. The first consists of a history of the method and 
a consideration of methodological problems including a discussion of the 
“objectivity” of projective tests and the role of statistics in their valida- 
tion and standardization. 

Part II (about a third of the book) describes the technique of admin- 
istration and the scoring system. The responses of the subject are scored 
for location, determinants (form, movement, color, and shading), and 
content. For most of the categories the criteria for scoring are carefully 
enough defined so that the agreement between trained scorers should 
approach that on, say, the vocabulary items of the Revised Stanford- 
Binet. 

Part III consists, first, of definitions of the variables which the method 
purports to describe, e.g., introversion and extraversion, control (con- 
striction and dilation), emotional responsiveness, the mental approach, 
anxiety, maturity, etc., and second, of statements of the significance of 
the scoring categories with respect to these variables. A scoring category 
does not have a single interpretational significance; it may have different 
meanings in different records, depending on the values in each of the other 
categories. That every record has unique configurational aspects is con- 
tinually emphasized, but in spite of these recurring holistic strictures, 
Klopfer often seems to assume a one-to-one correspondence between a 
scoring category and a personality variable. For example, he says, 
“. . . the proportion of usual details may indicate the awareness of the
obvious and the immediate problems of everyday life . . .” (p. 203), and,
“Plain diffusion responses (K) and toned-down shading effects (k) in-
variably indicate insecurity and anxiety of the free-floating type” (p. 241).
More commonly, though, the multiple interpretational possibilities of the
variables and their ratios are discussed against the background of dif-
ferent kinds of records.

It is unfortunate that this section contains no validating information 
of the kind usually found in the descriptions of more objective tests. 
Instead the reader must be satisfied with such statements as, “A pre-
dominance of human profiles, based on parts of the contours of all the
blots, has been found in cases of anxiety neuroses, combined with a
strongly introversial personality constellation” (p. 264), and, “It is one
of the best proved and validated assumptions that the percentage of FC
[fusion of form and color] indicates the degree of emotional adjustment
to outer reality” (p. 282). Most of the interpretations assume a great
deal more stimulus-equivalence and response-generalization than most 
experimental psychologists would be willing to grant without extensive 
empirical evidence. It is true that validating the interpretative sig- 
nificance of fifty to a hundred scoring categories, the meaning of each 
of which may vary with a change in the value of each of the others, 
presents a formidable task, logically and practically, and, indeed, some 
Rorschachists would claim, a work of supererogation. There are, how- 
ever, many indications of the validity of the Rorschach in the clinical 
and experimental literature, and this section could have been made more 
convincing to the general reader and more useful to clinical and research 
workers if citations from these studies had been included. Such facts 
are an integral part of a technical manual. 

In the last section Dr. Kelley summarizes the Rorschach findings in 
intracranial organic pathology, dementia praecox, the psychoneuroses, 
mental deficiency and other clinical conditions. His task was a difficult 
one, for not all investigators have used the same scoring system, and 
criteria for the diagnosis of these conditions vary from one hospital and 
clinic to another. Much of this research has been done with “signs,” 
scoring categories or aspects of the whole record which differentiate 
among clinically diagnosed groups and normals. The methodological 
problem here is simpler than in personality description. The question 
of the meaning of a sign does not necessarily arise; a sign is significant 
if it appears more frequently in one group than it does in another. In- 
dividual diagnosis consists in essence of counting the number of signs 
typical of each condition, giving weight to those with the highest dis- 
criminative value. It is to be hoped that further research will employ 
more objective weighting methods than have been used to date. It is 
possible that a statistical method such as discriminant function which 
maximizes the predictive value of a series of indices, no one of which by 
itself is statistically significant, may provide better group differentiation 
and hence more accurate individual diagnosis. 

Klopfer and Kelley have made a significant contribution to the study
of personality. In offering a scheme for analyzing the projective material 
of the Rorschach which may well become standard, and summarizing 
what is known about interpretation and diagnosis, they have provided 
clinicians with an immediately useful tool. Research workers will be 
intrigued by the perplexing problems of validation, and they will be 
forced to use (or invent) more elaborate experimental designs than the 
simple measures of association used in validating tests of nomothetic 
abilities and traits. The field of personality measurement has been too 
long dominated by over-simplified conceptualizations of personality 
structure derived from methods successful in other areas of measurement; 
we have thought of personality as being made up of those things which 
we had the tools to measure and validate. It is good that Klopfer and 
Kelley have stated their conceptualization so clearly. It makes obvious 
the need for new methods. 


Robert E. Harris
The Langley Porter Clinic
University of California Medical School


Hollingworth, Leta S. Children above 180 IQ Stanford-Binet. Origin and
development. Yonkers-on-Hudson, New York: World Book, 1942. 
Pp. xviii + 352. 

The manuscript of this book, which was left unfinished at the time of 
Dr. Hollingworth’s death in 1939, was completed by her husband, Prof. 
H. L. Hollingworth, who, in a brief Foreword, describes the stage to 
which the material had been brought at the time of Leta Hollingworth’s 
death and the amount of editorial revision and extension carried out by 
him. 

The book is divided into three parts. The Preface and the 3 chapters 
comprising Part I were all completed before the author’s death. The 
first chapter, entitled ‘‘The concept of intellectual genius” presents an 
interesting summary of various points of view as to the nature of genius, 
the characteristics of geniuses and the attitudes shown toward them by 
society. The misunderstanding and persecution which has frequently 
been their lot is particularly stressed. Of the two remaining chapters, 
one is devoted to a summary of the literature on eminent adults, the 
other to an account of earlier published reports of the development and 
behavior of individual children who tested above 180 IQ, together with 
a small number, reported before the era of Binet testing, who presumably 
would have ranked at this level. 

In Part II, individual case histories of 12 children, not previously 
reported in the literature, are presented. As far as possible the data for 
each child are discussed under the following heads: (1) family background, 
(2) preschool history, (3) school history, (4) judgments of teachers, (5)
mental measurements, (6) traits of character, (7) physical measurements 
and health, (8) miscellaneous characteristics, and (9) later development 
and accomplishments. The data given vary, however, from child to 
child, either because information on certain topics was lacking or because 
of the youth of the child at the time of report. In commenting on this 
material, H. L. Hollingworth notes that only 5 of the 12 cases had been 
formally written up before the author’s death, and that much information 
possessed by her about these cases as well as about the remaining 7 whose 
reports have been collated by him from data left on file, was never made 
a matter of written record and is thus irrecoverable. As they stand, the 
biographies are similar to others dealing with highly exceptional children. 
They are replete with anecdotes illustrating the remarkable gifts of reason- 
ing and linguistic skill possessed by these youngsters at very early ages. 
At the close of this section, two chapters by H. L. Hollingworth summarize 
the data on heredity and early development, scholastic achievement and 
creative activity. The difficulties of educational adjustment encountered 
by a child whose mental age is almost twice his chronological age are 
pointed out. The conclusion is reached that best adaptation is usually 
achieved when the highly gifted child is identified early and special edu- 
cational arrangements are made whereby only a part of his school time 
is given up to the formal subjects of the curriculum with the remainder 
left free for individual work along lines determined by his personal inter-
ests and initiative.

The five chapters included under Part III, General principles and 
implications, are reprinted from earlier publications of Leta S. Holling-
worth. In the first of these, Hollingworth questions the appropriateness
of the term “genius” as applied to children of 140-150 IQ, a level which,
she states, designates only about the 75th percentile of college graduates. 
She suggests that the term might better be reserved for those ranking at 
or above the 180 IQ level. The second and third papers deal with the 
personality characteristics of children in this intelligence range. The
difficulty commonly experienced by them in finding congenial companion-
ship and their consequent tendency to become isolated and lonely is
pointed out. Because so few of these children completely overcome this 
handicap, the author is of the opinion that, as far as personal happiness 
is concerned, the optimal intellectual level is more likely to fall within the 
range of 130-150 IQ than at the maximum. The loss suffered by society 
from this failure to help the highly gifted child to become a truly socialized 
being is stressed. 

The remaining papers deal with the education and training of excep-
tionally bright children. Their general tenor is well summed up in the
concluding paragraph (pp. 321-2).

“More and more it becomes clear that human welfare on the whole is
much more a matter of the activities of deviates than it is a matter of what
the middle mass of persons does. Those educators who make a joke of
the genius and regard the dullard as a mere figment of the imagination of
psychologists, or who solve the educational problems which these children
present by the simple device of ‘not believing in them,’ fiddle while
Rome burns. It is the deviate who takes the initiative and plays the
primary part in social determination. How shall we, then, educate him
in a democracy?”


Florence L. Goodenough
University of Minnesota 


Seashore, Robert H. Fields of psychology: An experimental approach. 
New York: Henry Holt & Company, 1942. 


“The title Fields of Psychology: An Experimental Approach, indicates
that experimental methods have now been extended to cover all of the 
special fields of psychology. As evidence we shall present a series of 
descriptions of modern research which will include all gradations of 
methods from the classical single variable type of problem to the latest 
multiple variable types such as are used in factorial analyses. We shall 
also introduce the reader to the psychologists’ various places of work, 
from the sound-proof, electrically shielded darkroom laboratories to the 
schools, clinics, industries, studios and playing fields which are often 
more representative situations for the study of behavior in everyday 
life.” (Preface.) 

The plan of the book arose from the experiences of the Editor in visits 
to laboratories, clinics and personnel offices of colleagues at other insti- 
tutions. So profitable were these visits that he decided to extend these 
experiences “to students by securing a still more representative sample
of current psychological investigations in each of the special fields and 
by having an expert in each field interpret the methods and findings in 
print.” 

The collaborative plan of the book is indicative of “the remarkable
degree of specialization in psychology.” In fact, the author rightly
remarks that “even other psychologists need a series of experts to keep
them in touch with the rapid progress of research in the various fields.”
Furthermore, “one of the marks of a maturing science is the more frequent
appearance of multi-authored articles and books, a trend which is very
noticeable in present-day psychology.”

This book should not be confused with other volumes entitled Fields 
of Psychology, because it does not attempt to list all of “the principal
sub-topics or to give a bird’s eye view of the whole field,” but attempts
to “show in considerable detail just how some of our major investigations
have developed, . . .” 

The editor has chosen nine fields in which to present representative
investigations by a chosen expert in each field. 

In Part One, Introduction, the Editor briefly characterizes each of the 
eleven fields chosen. Claude E. Buxton presents five chapters in Part 
Two, General Experimental Psychology. For the first chapter, The 
Frame of Reference, he uses the material from M. Sherif’s The Psychology 
of Social Norms and the same author’s A Study of Some Social Factors in 
Perception. The main points are presented in a clear and straightforward 
manner and the vocabulary is certainly within the grasp of a college 
student. In an interesting manner the author builds up a concept of a 
frame of reference and succinctly describes the experimental technique
and results. 

In Part Three, Donald B. Lindsley presents Physiological Psychology 
in five chapters: Sensory and motor action, Motor nerve action, Experi- 
ments in audition, Cortical function, and Electro-encephalography. The 
facts given under these headings are gleaned from relatively recent 
investigations, some of which are the author’s own work. 

Part Four, Comparative Psychology, has three chapters by Harry F. 
Harlow in which he presents the results of studies of transfer and dis- 
criminative behavior in monkeys, the inheritance of wildness and savage- 
ness in rats, the behavior and social relations of Howling monkeys and 
cooperative solving of problems by young chimpanzees. Comparative 
psychology is so saturated with accounts of the learning behavior of rats 
and the significance of hens pecking grains of corn from gray paper, that 
a little disappointment is felt at their absence; however, this is not meant 
as a criticism. The material has added interest because of its degree of 
similarity to human behavior. 

Beth L. Wellman presents the materials in Part Five, Human De- 
velopment. The question as to the effect of the environment upon the 
IQ is re-emphasized. After two chapters, the author concludes: “It 
seems to the writer that the only explanation that logically applies to all 
of these results is that the IQ changes when changes are made in the 
environment, and that when the environment differs markedly, over an 
extended period of time, from that experienced by children in general, the 
IQ is correspondingly affected.” 

Another chapter presents data concerning the changes which training 
produces in the personality and social behavior of children. This section 
also includes McGraw’s study of Johnny and Jimmy, and a chapter on 
the correlation of mental and manual abilities with age. 

Only an enumeration of the remaining fields and the names of the 
collaborators is possible. There are four chapters in Part Six, Educa-
tional Psychology, by Dael L. Wolfle. These chapters are concerned
with primary mental abilities, learning to read, the transfer of training,
and the Chicago college plan. Representative data in the fields of voca- 
tional guidance and industrial psychology are presented in Parts Seven 
and Eight respectively by E. G. Williamson and Harrison Musgrave. 
The section on avocational psychology, Part Nine, is written by Harold 
G. Seashore. Part Ten is concerned with social psychology, written by 
Paul R. Farnsworth. Edmund S. Conklin writes Part Eleven, Abnormal
and Clinical Psychology, in three chapters, and one of these is concerned 
with experiments with clairvoyance or extra-sensory perception. The 
reviewer feels that this part could have been made more valuable. There 
is an abundance of data in the clinical and abnormal fields. 

The Editor finishes the book in Part Twelve under the title of Sys- 
tematic Psychology. The point of view is that the various schools, 
points of view, or theories are supplemental, each contributing its own 
share of data to the understanding of psychology as a science, not as a 
conglomeration. In the last chapter the general trend of experimental
psychology is briefly summarized under the subtopics of sensory prob-
lems, affective processes, motor processes, and thought processes.

This is an exceptional book. It is written in such an interesting style 
that one wants to finish it at one reading. Practically any reader will 
find a few or several fields of special interest. The student will be con-
vinced that the varied problems in psychology are attacked with exacting
scientific rigor.

The format of the book deserves special commendation.

J. R. Gentry
Ohio University